Data retrieval, manipulation, and storage in Julia.
Organizations
Resources
- Blog on The Lesser Known Normal Forms of Database Design
- Database-like ops benchmark of different languages and libraries.
- Julia-data-science : Notebooks on DS basics with Julia and why it is suitable for data science.
- Julia DataFrames Tutorial by Bogumił Kamiński.
General purpose
- DataFrames.jl : In-memory tabular data in Julia.
- DataFrameMacros.jl : an opinionated take on DataFrame manipulation in Julia with a syntax geared towards clarity, brevity and convenience.
- DataFramesMeta.jl : Metaprogramming tools for DataFrames and AbstractDict objects. These macros improve performance and provide more convenient syntax.
- Cleaner.jl : A toolbox of simple solutions for common data cleaning problems.
- InMemoryDatasets.jl : Multithreaded package for working with tabular data in Julia.
- Kezdi.jl : Julia package for data manipulation and analysis. JuliaCon 2024
- Pandas.jl : A Julia front-end to Python’s pandas package.
- TableOperations.jl : Common table operations on Tables.jl interface implementations.
- Tables.jl : A generic interface for tables in Julia.
- TableWidgets.jl : Interactive widgets to work with tabular data in Julia.
Tabular Data
See also data-science
- Arrow.jl : Pure Julia implementation of the Apache Arrow data format.
- Schemata.jl : Schema (specification of a data set) for tabular data sets in Julia.
- Feather.jl : Julia library for working with Feather (v1) formatted files.
- JuliaDB.jl : JuliaDB is a package for working with large persistent data sets.
- Tables.jl : This package provides four useful interface functions for working with tabular data in a variety of formats.
- MAT.jl : Julia module for reading MATLAB files.
- StatFiles.jl : FileIO.jl integration for Stata, SPSS, and SAS files.
- ReadStat.jl : Read files from Stata, SAS, and SPSS.
CSV files
Reading and writing CSV files.
- CSV.jl : Utility library for working with CSV and other delimited files in the Julia programming language.
- DelimitedFiles.jl : A package for reading and writing files with delimited values (Originally a Julia stdlib).
- CSVFiles.jl : FileIO.jl integration for CSV files.
- ReadWriteDlm2.jl : CSV IO. Works like readdlm/writedlm, but uses a decimal comma by default. Additionally supports Date, DateTime, Time, Complex, Missing and Rational types.
Parquet files
- Parquet.jl : Julia implementation of a reader and writer for the Parquet columnar file format.
- Parquet2.jl : (another) pure Julia implementation of the Parquet tabular data binary format.
HDF5 files
The HDF5 format supports tables as well as heterogeneous data.
- HDF5.jl : Library for reading HDF5 files, a widely used file format for general data.
- HDF5Logger.jl : Allows logging individual frames of data to an HDF5 file over time.
Database API
- DBInterface : An abstract DBI interface to provide a database-independent API protocol that all database drivers can be expected to comply with.
- IndexedTables.jl : Tabular data structures where some of the columns form a sorted index. It provides the backend to JuliaDB.jl.
- JDBC.jl : Julia interface to Java database drivers.
- LevelDB.jl : Julia interface to Google’s LevelDB key value database.
- Memcache.jl : Julia memcached client.
- ODBC.jl : A low-level ODBC interface for the Julia programming language. Tabular Data I/O in Julia.
- QuandlAccess.jl : Convenient access to the Quandl data service.
Accessing datasets
For tabular data files, see fileio
- DataDeps.jl : reproducible data setup for reproducible science.
- DataToolkit.jl : Reproducible, flexible, and convenient data management.
- FaceDatasets.jl : A package for easy access to face-related datasets.
- Faker.jl : A package that generates fake data.
- RDatasets.jl : Julia package for loading many of the datasets available in R.
- WorldBankData.jl : The World Bank data.
SQL and Relational Database Management Systems
- duckdb : an in-process SQL OLAP Database Management System with a Julia API.
- DataKnots.jl : An extensible, practical and coherent algebra of query combinators.
- LibPQ.jl : A Julia wrapper for the PostgreSQL libpq C library.
- MySQL.jl : Julia bindings and helper functions for the MariaDB/MySQL C library.
- Octo.jl : an SQL Query DSL in Julia to be used with other SQL drivers.
- SparkSQL.jl : working with Apache Spark data using just SQL.
- SQLite.jl : Julia interface to the SQLite library with support for operations on DataFrames.
- SQLStrings.jl : @sql_cmd macro for SQL query strings.
NoSQL databases
- CQLdriver.jl : Interfacing with CQL compliant databases. Used with DataFrames.jl.
- LMDB.jl : A wrapper interface to Lightning Memory-Mapped Database (LMDB) key-value embedded data store.
- Mongoc.jl : MongoDB bindings (newer) and a wrapper around libbson, for the Julia language.
- Mongo.jl : MongoDB bindings for the Julia programming language.
- Redis.jl : A fully-featured Redis client for the Julia programming language.