Over the last six months, we have been quietly working away to complete the first official “1.0” release of Deedle, which we are delighted to announce today. There are a lot of changes and new features in the 1.0 release, including:
- Performance optimizations: e.g. when working with df.Columns, faster windowing & chunking, merging of series & frames, faster numerical per-frame operations & a few more.
- New functionality: Stats module (see intro page)
- Lots of assorted bug fixes
- The API is not 100% backwards compatible, but is a lot more uniform and should be clearer.
Not to be out-done, the RProvider also gets an update today:
- Now uses out of process execution for package/function discovery, so it is less likely to kill your Visual Studio when R misbehaves.
- Now comes with RData support: http://bluemountaincapital.github.io/FSharpRProvider/reading-rdata.html.
We are continuing to improve the RProvider, in particular developing support for interacting with remote and out-of-process R sessions. We’ve also tagged a number of enhancements that are suitable for other enthusiastic souls to pick up and contribute. See up-for-grabs.
Tags: csharp, data, dotnet, fsharp, opensource
Yesterday we announced Deedle, our new Open Source library for exploratory data analysis in C# and F#. Deedle (almost) stands for “Dotnet Exploratory Data Library”. This is a library for .NET with similar capabilities to the widely respected Pandas library for Python.
Deedle was developed by Tomas Petricek, with assistance from Adam Klein and myself.
We are finding Deedle to be extremely powerful for research. We hope others will find it similarly useful and make improvements to make it an even better package.
Tags: cassandra, time series
This week, Jake and Carl spoke at the NYCassandra tech day.
The presentation was well received. Make sure to check it out on SlideShare.
Jake Luciani and Carl Yeksigian will be presenting at the NYC* Tech Day 2013. They will be talking about how BlueMountain has harnessed Cassandra to deliver a scalable time series database. If this is an interesting topic for you, make sure to register!
Tags: github, glance, opensource
Glance is a new open source project from BlueMountain Capital.
Glance, a metrics dashboard
Currently, Glance can only look at data that is exposed through Graphite. We are working on adding support for Riemann’s websocket protocol, which will also enable real-time metrics to be pushed out to the dashboard.
Get your focus back
Glance hides the navigation after you have selected your dashboard. Mousing over the navigation tab will cause it to reappear.
This small feature leaves more screen space for the metrics. Once you have a dashboard selected, Glance gets out of the way so that you can focus on the metrics.
Search here, search there
The search box searches the metrics displayed on the page as well as those on the server. This allows quick filtering of the displayed metrics as well as finding metrics that are not included on the page.
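As a rough illustration of the two-level search described above, the following JavaScript sketch filters the metrics already on the page and builds a lookup query for the server. It is not Glance's code: `filterDisplayed` and `serverFindUrl` are hypothetical helpers, the host name is made up, and the server lookup assumes Graphite's `/metrics/find` endpoint with its `query` parameter.

```javascript
// Hypothetical helpers; names and behavior are illustrative only.

// Keep the displayed metric names that contain the search text (case-insensitive).
function filterDisplayed(displayedMetrics, text) {
  const needle = text.trim().toLowerCase();
  if (needle === "") {
    return displayedMetrics.slice(); // empty search: show everything
  }
  return displayedMetrics.filter((name) => name.toLowerCase().includes(needle));
}

// Build a URL for a server-side lookup using a wildcard query.
// Note: Graphite wildcards match within a single dot-separated path segment.
function serverFindUrl(baseUrl, text) {
  const params = new URLSearchParams({ query: `*${text.trim()}*` });
  return `${baseUrl}/metrics/find?${params.toString()}`;
}

const displayed = ["web01.cpu", "web01.memory", "db01.cpu"];
console.log(filterDisplayed(displayed, "CPU"));
console.log(serverFindUrl("http://graphite.example.com", "cpu"));
```

Filtering locally first keeps the page responsive, while the server query can surface metrics that the current dashboard does not show.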
Easily define new dashboards
glance.page("cpu","CPU")
    .find("*.cpu")
    .asPercent(1.0)
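To illustrate how a fluent definition like the one above might translate into a Graphite query, here is a minimal JavaScript sketch. It is not Glance's actual implementation: the `PageBuilder` class, its `toRenderUrl` method, the `glanceSketch` object, and the `graphite.example.com` host are hypothetical names introduced for illustration. The sketch assumes only Graphite's render API, which accepts `target` and `format` query parameters, and Graphite's `asPercent` function.

```javascript
// Hypothetical sketch of a fluent dashboard builder; not Glance's real code.
class PageBuilder {
  constructor(id, title) {
    this.id = id;        // short identifier for the page, e.g. "cpu"
    this.title = title;  // human-readable title, e.g. "CPU"
    this.target = null;  // Graphite target expression built so far
  }

  // Select the metrics matching a Graphite path pattern.
  find(pattern) {
    this.target = pattern;
    return this; // returning `this` is what makes the calls chainable
  }

  // Wrap the current target in Graphite's asPercent() function.
  asPercent(total) {
    if (this.target === null) {
      throw new Error("call find() before asPercent()");
    }
    this.target = `asPercent(${this.target},${total})`;
    return this;
  }

  // Build a render API URL that returns the selected series as JSON.
  toRenderUrl(baseUrl) {
    if (this.target === null) {
      throw new Error("call find() before toRenderUrl()");
    }
    const params = new URLSearchParams({ target: this.target, format: "json" });
    return `${baseUrl}/render?${params.toString()}`;
  }
}

// Stand-in for the `glance` entry point used in the post.
const glanceSketch = { page: (id, title) => new PageBuilder(id, title) };

const cpuPage = glanceSketch.page("cpu", "CPU").find("*.cpu").asPercent(1.0);
console.log(cpuPage.target);
console.log(cpuPage.toRenderUrl("http://graphite.example.com"));
```

Each method returns the builder itself, which is what allows the definition to read as a single chained expression.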
All in browser
Glance uses HTML5 technologies to push the logic of metric capture and dashboard creation to the user’s browser. This enables a static page to be served by the server and quick load on the client side.
We’ve tried to use the best technologies that are suited for our use case. These technologies include:
- Bootstrap: a clean, easy-to-use Web UI layer
- d3: A data driven graphics library
- Font Awesome: An icon library for Bootstrap
- Ubuntu Font: A clean, web-ready font.
Glance is open source, under the Apache License. Feel free to fork Glance or submit issues on our GitHub page.
Yesterday we released 3 new open source projects on GitHub: riemann-cassandra, riemann-csharp, and riemann-health-windows. We are really excited about the Riemann project, and we have already started to integrate the project into our monitoring efforts.
Our infrastructure at BlueMountain includes Windows machines, Linux machines, and a lot of custom software. As we scale up the number of machines on our grid and integrate new software, including Cassandra, our monitoring requirements keep growing. Riemann fits our monitoring needs very well. Its push model allows us to be proactive instead of reactive; we can receive alerts before our users complain.
Because Riemann can feed directly into Graphite, we can have nice graphs for our historical data and a dashboard that alerts us to changing conditions. Below is a collection of graphs from our current Cassandra cluster.
Many thanks to the creator of Riemann, Kyle Kingsbury.
Tags: econometrics, fsharp, github, opensource, R, statistics
Here at BlueMountain we like to perform statistical analysis of data. The stats package R is great for doing that. We also like to use the data retrieval and processing capabilities of F#. F#’s interactive environment lends itself pretty well to data exploration, and we can also easily access our existing .NET-based libraries. Once we are done, we can build and release production-supportable applications.
Nothing on the .NET platform competes with R for statistical functionality, so we set about bridging the gap between F# and R. F# 3.0 provides a nice innovative mechanism for doing this, through Type Providers.
We have released an Open Source RProvider on GitHub. Here’s an example of how to use it:
// Pull in stock prices for some tickers then compute returns
let data = [
    for ticker in [ "MSFT"; "AAPL"; "VXX"; "SPX"; "GLD" ] ->
        ticker, getStockPrices ticker 255 |> R.log |> R.diff ]

// Construct an R data.frame then plot pairs of returns
let df = R.data_frame(namedParams data)
R.pairs(df)
Any of the calls above that begin with R. are actually evaluated inside the R engine.
This produces a lovely pair plot like this:
While we intend to continue to enhance the provider to meet our needs, we really hope others will do the same. If you use F# and work in the statistical/econometrics space, please try it out. If you use R and are looking for a robust environment in which to develop applications, also try it (and F#) out. If you have ideas for improvements, please feel free to share them with us. And if you develop enhancements/fixes, please submit a pull request!
The RProvider is built on the RDotNet project, which handles all the gnarly interop with unmanaged data structures used by R.DLL. The Type Provider provides an easy-to-use layer on top of that to use R from F#. Many thanks go to the RDotNet author, Kosei.
We are very pleased to launch this blog, through which the members of the BlueMountain Quantitative Strategy team will share opinions, provide useful information about technology and announce open-source software.
You can read a bit about what we do by checking out this Waters Technology article: http://www.waterstechnology.com/waters/feature/2195717/mountain-lions-pursuit-of-it-innovation-keeps-bluemountain-ahead-of-the-pack.