Project Description
A simple C# API for loading tabular dataframes into a Microsoft SQL Server database, using only a small number of tables to represent any kind of dataframe.

Data Frame Loader is a simple C# library for loading tabular, ad-hoc data into a fixed number of tables in Microsoft SQL Server and then easily consuming that data back in your .NET code. The API is intentionally simple to use: it relies on .NET DataTables and a very simple relational schema in Microsoft SQL Server. Although the data is stored in a meta-format that supports an arbitrary number of source data schemas, you, the developer, only have to work with straightforward DataTables.

Change Log

2014-08-29 - Added DataFrameInfoProps and the corresponding changes in the .NET API for storing adhoc metadata associated with each DataFrameInfo object. These are simple string key-value pairs, which are useful for efficiently querying for specific subsets of data. You can associate things like Knowledge Time, As-Of Date, Batch Id, and any other concept that is common across all the rows stored in a given dataframe.


To Load Data
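// Assumes a dataframe named "YahooPrices" has already been uploaded
// and that sqlConn points at the database holding it.

// 0. Get an instance of the dfloader DbContext.
var db = new DbContext(sqlConn);

// 1. Get the id of the latest upload.
//    Last() picks the final entry, so this relies on the infos
//    being returned in upload order.
var id = db.GetDataFrameInfos()
           .Where(f => f.Schema.Name == "YahooPrices")
           .Select(f => f.Id)
           .Last();

// 2. Load the data as a regular DataTable.
DataTable table = db.GetDataFrameById(id);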
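
Once loaded, the result is an ordinary System.Data.DataTable, so it can be consumed with the standard ADO.NET API. The sketch below is only an illustration: the Symbol and Close columns and the BuildSample helper are assumptions that stand in for whatever the YahooPrices dataframe actually contains.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Globalization;

public static class DataTableExample
{
    // Turns each row into a "Column=Value, Column=Value" line.
    public static List<string> Describe(DataTable table)
    {
        var lines = new List<string>();
        foreach (DataRow row in table.Rows)
        {
            var parts = new List<string>();
            foreach (DataColumn col in table.Columns)
            {
                // Invariant culture keeps number formatting stable across machines.
                string value = Convert.ToString(row[col], CultureInfo.InvariantCulture);
                parts.Add(col.ColumnName + "=" + value);
            }
            lines.Add(string.Join(", ", parts));
        }
        return lines;
    }

    // Hypothetical stand-in for a table returned by GetDataFrameById.
    public static DataTable BuildSample()
    {
        var table = new DataTable("YahooPrices");
        table.Columns.Add("Symbol", typeof(string));
        table.Columns.Add("Close", typeof(decimal));
        table.Rows.Add("MSFT", 45.5m);
        table.Rows.Add("AAPL", 102.25m);
        return table;
    }
}
```

In real use you would pass the table returned by GetDataFrameById to Describe instead of calling BuildSample.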

Gain control over your adhoc datasources.

Data Frame Loader works great when you have a large number of adhoc CSV files that need to be consumed by your .NET application and maintaining that library of files has become a burden. By loading the data into a simple database schema, you introduce a level of control and predictability. In addition, loading the data from a file into the database gives you an opportunity to perform simple data validation.

A very simple schema is used to model any kind of data.
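
The actual table layout is not described on this page, so the following is only a sketch of the general idea, not Data Frame Loader's real schema: a frame with arbitrary columns can be stored in a fixed structure by keeping one record per cell and rebuilding the table on the way out. The Cell and MetaFormatSketch names are hypothetical, and all values are stored as strings for simplicity.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Globalization;

// Hypothetical fixed-shape record: one per cell of the source frame.
public sealed class Cell
{
    public int RowIndex { get; set; }
    public string ColumnName { get; set; }
    public string Value { get; set; }
}

public static class MetaFormatSketch
{
    // Breaks a table with any columns into uniform cell records.
    public static List<Cell> Unpivot(DataTable table)
    {
        var cells = new List<Cell>();
        for (int r = 0; r < table.Rows.Count; r++)
        {
            foreach (DataColumn col in table.Columns)
            {
                cells.Add(new Cell
                {
                    RowIndex = r,
                    ColumnName = col.ColumnName,
                    Value = Convert.ToString(table.Rows[r][col], CultureInfo.InvariantCulture)
                });
            }
        }
        return cells;
    }

    // Rebuilds a string-typed table from cell records.
    public static DataTable Pivot(IEnumerable<Cell> cells)
    {
        var table = new DataTable();
        var rows = new Dictionary<int, DataRow>();
        foreach (Cell cell in cells)
        {
            if (!table.Columns.Contains(cell.ColumnName))
            {
                table.Columns.Add(cell.ColumnName, typeof(string));
            }
            DataRow row;
            if (!rows.TryGetValue(cell.RowIndex, out row))
            {
                row = table.NewRow();
                rows[cell.RowIndex] = row;
                table.Rows.Add(row);
            }
            row[cell.ColumnName] = cell.Value;
        }
        return table;
    }
}
```

The point of the sketch is that the storage shape never changes when a new kind of frame appears; only the cell records differ.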


Features

The following features are supported:
  • works great in controlled database environments where you are restricted from creating adhoc tables for storing data with adhoc schemas. Data Frame Loader uses a fixed number of tables (four tables in all).
  • can do simple validation of data upon upload.
  • can version the datasets, including their schemas.
  • required/active keys
  • very simple schema built on top of Microsoft SQL Server, which allows you to add functionality at the database level as needed. For example, you can add more complex constraints, triggers, etc.
  • SQL Server view that provides a tabular projection of the data in a form that can be joined within your own SQL queries.
  • GUI to load data
  • console app for running from the command line, which makes it easy to schedule as a service.
  • FAST - uses SQL bulk copy to load a couple of megabytes of data in several seconds.
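
The validation rules that Data Frame Loader applies are not spelled out here. As a rough illustration of what "simple validation" and "required keys" could look like before an upload, the sketch below checks that required columns exist and contain no empty values. FrameValidator and its Validate method are hypothetical names and are not part of the library's API.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class FrameValidator
{
    // Returns a list of problems; an empty list means the table passed.
    public static List<string> Validate(DataTable table, IEnumerable<string> requiredColumns)
    {
        var errors = new List<string>();
        foreach (string name in requiredColumns)
        {
            if (!table.Columns.Contains(name))
            {
                errors.Add("Missing required column: " + name);
                continue;
            }
            for (int i = 0; i < table.Rows.Count; i++)
            {
                object value = table.Rows[i][name];
                // Treat database nulls and blank strings as missing values.
                if (value == DBNull.Value || string.IsNullOrWhiteSpace(Convert.ToString(value)))
                {
                    errors.Add("Row " + i + ": empty value in required column " + name);
                }
            }
        }
        return errors;
    }
}
```

A caller could run such a check on the DataTable built from a CSV file and refuse to upload when the returned list is not empty.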

Overall Architecture


Last edited Aug 29, 2014 at 4:08 PM by privosoftllc, version 14