C++ CSV/DSV Filter Library


Description

The C++ CSV/DSV Filter Library is a simple-to-use, easy-to-integrate, and efficient library for processing CSV/DSV data held in an in-memory store. The DSV filter evaluates complex expressions row by row against the loaded store. The accompanying example demonstrates simple SQL-like filtering of user-specified DSV files. Note: CSV stands for Comma-Separated Values and DSV for Delimiter-Separated Values.

Capabilities

The C++ CSV/DSV Filter Library has the following capabilities:

  • Arbitrarily complex expressions over columns composed of both strings and numbers
  • Selectable output columns
  • Extremely fast in-memory store
  • Memory mapped file option
  • Single-header solution that requires no installation or build step

C++ CSV/DSV Filter Library License

Free use of the C++ CSV/DSV Filter Library is permitted under the terms of the MIT License.

Compatibility

The C++ CSV/DSV Filter Library implementation is fully compatible with the following C++ compilers:

  • GNU Compiler Collection (3.5+)
  • Clang/LLVM (1.1+)
  • Microsoft Visual Studio C++ Compiler (7.1+)
  • Intel® C++ Compiler (8.x+)
  • AMD Optimizing C++ Compiler (1.2+)
  • Nvidia C++ Compiler (19.x+)
  • PGI C++ (10.x+)
  • IBM XL C/C++ (9.x+)
  • C++ Builder (XE4+)


Example

In the following example, we have a table of OHLC (open, high, low, close) values for two equities, GOOG and MSFT, and we wish to run a series of queries that extract the rows matching various criteria.

#   Date_s    Symbol_s  Open_n    Close_n   High_n    Low_n     Volume_n
00  20090701  GOOG      424.2000  418.9900  426.4000  418.1500  2310768
01  20090701  MSFT      24.0500   24.0400   24.3000   23.9600   54915127
02  20090702  GOOG      415.4100  408.4900  415.4100  406.8100  2517630
03  20090702  MSFT      23.7600   23.3700   24.0400   23.2100   65427699
04  20090703  GOOG      408.4900  408.4900  408.4900  408.4900  0
05  20090703  MSFT      23.3700   23.3700   23.3700   23.3700   0
06  20090706  GOOG      406.5000  409.6100  410.6400  401.6600  2262557
07  20090706  MSFT      23.2100   23.2000   23.2800   22.8700   49207638
08  20090707  GOOG      408.2400  396.6300  409.1900  395.9801  3260307
09  20090707  MSFT      23.0800   22.5300   23.1400   22.4600   52842412
10  20090708  GOOG      400.0000  402.4900  406.0000  398.0600  3441854
11  20090708  MSFT      22.3100   22.5600   22.6900   22.0000   73023306
12  20090709  GOOG      406.1200  410.3900  414.4500  405.8000  3275816
13  20090709  MSFT      22.6500   22.4400   22.8100   22.3700   46981174
14  20090710  GOOG      409.5700  414.4000  417.3700  408.7000  2929559
15  20090710  MSFT      22.1900   22.3900   22.5400   22.1500   43238698

The following are a few example queries:

  • volume >= 1000000 and symbol == 'GOOG'
  • abs(open - close) > abs(high - low)
  • avg(open,close,high,low) * volume > 10^7 and inrange('20090702',date,'20090730')
  • (open > close) and (symbol like '*FT*') and (date >= '20090101')
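
The last of these queries can be applied to a file as follows:

#include <cstddef>
#include <string>

#include "dsv_filter.hpp" // the library's single header (file name assumed here)

int main()
{
   std::string file_name = "ohlc.txt";

   dsv_filter filter;

   // Load the DSV file into the in-memory store; give up if that fails.
   if (!filter.load(file_name))
      return 1;

   std::string expression = "(open > close) and (symbol like '*FT*') and (date >= '20090101')";

   // Register the expression that will be evaluated against each row.
   filter.add_filter(expression);

   // Iteration starts at row 1; row 0 is presumably the header line.
   for (std::size_t row = 1; row < filter.row_count(); ++row)
   {
      if (dsv_filter::e_match == filter[row])
      {
          // do something with row...
      }
   }

   return 0;
}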
                       



© Arash Partow. All Rights Reserved.