JuliaDB is a high-performance database for in-memory and distributed computing. It provides a coherent analytics environment, written entirely in Julia, for storing and computing on large distributed datasets.
Loading data into traditional environments like Python and R is convenient, but difficult to scale. Distributed computing systems like Hadoop and Spark gain scale, but compromise productivity. As a result, modern data analysis and modeling pipelines end up gluing together multiple systems and languages: SQL, Python, R, C++, Scala, Unix tools, and more. This leads to complexity, overhead, and impedance mismatches when systems don’t support the same data types or operations.
JuliaDB lets you leverage Julia’s built-in parallelism to fully utilize any machine or cluster, performing indexing, filtering, aggregation, machine learning, and more without loading full datasets into main memory. Results can be saved to distributed storage. Julia’s rich ecosystem of packages for plotting, statistics, deep learning, and optimization (to name a few) interoperates with the system effortlessly. Julia is natively distributed and compiles user code to machine instructions that run as fast as C/C++ or Fortran. This combination allows sending arbitrary user code to data, wherever it lives, achieving seamless, scalable computing.
All you have to do now is load your data and get to work!
The table below shows how JuliaDB fares in comparison to its contemporaries.
|Feature|JuliaDB|Python's pandas|R's xts|KDB|Julia's TimeSeries.jl|
|N-D Data Structure|
|Multiple typed columns|
|User Functions Compile|With Effort|
JuliaDB is an open-source package supported by Julia Computing. To scale JuliaDB smoothly in your data center or the cloud, pair it with JuliaRun.
Contact us at [email protected] for a demo, support pricing for JuliaDB alone, or for pricing options for JuliaDB with JuliaRun.