Data retrieval, manipulation, and storage in Julia.

Organizations

Resources

General purpose

  • DataFrames.jl : In-memory tabular data in Julia.
      - DataFrameMacros.jl : An opinionated take on DataFrame manipulation in Julia with a syntax geared towards clarity, brevity and convenience.
      - DataFramesMeta.jl : Metaprogramming tools for DataFrames and AbstractDict objects. These macros improve performance and provide more convenient syntax.
  • Cleaner.jl : A toolbox of simple solutions for common data cleaning problems.
  • InMemoryDatasets.jl : Multithreaded package for working with tabular data in Julia.
  • Kezdi.jl : Julia package for data manipulation and analysis (JuliaCon 2024).
  • Pandas.jl : A Julia front-end to Python’s pandas package.
  • TableOperations.jl : Common table operations on Tables.jl interface implementations.
  • Tables.jl : A generic interface for tables in Julia.
  • TableWidgets.jl : Interactive widgets to work with tabular data in Julia.
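
As a rough illustration of the in-memory tabular workflow that DataFrames.jl targets, here is a minimal sketch. The column names and values are hypothetical sample data invented for this example, and DataFrames.jl is assumed to be installed.

```julia
using DataFrames
using Statistics: mean

# Hypothetical sample data, invented for illustration.
df = DataFrame(team = ["a", "a", "b", "b"], score = [90, 70, 85, 95])

# Keep only the rows whose score is above 80.
high = filter(:score => >(80), df)

# Compute the mean score per team.
by_team = combine(groupby(df, :team), :score => mean => :mean_score)
```

DataFrameMacros.jl and DataFramesMeta.jl offer macro-based alternatives to the `source => function => target` pairs used here.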

Tabular Data

See also data-science

  • Arrow.jl : Pure Julia implementation of the Apache Arrow data format.
  • Schemata.jl : Schema (specification of a data set) for tabular data sets in Julia.
  • Feather.jl : Julia library for working with Feather (v1) formatted files.
  • JuliaDB.jl : JuliaDB is a package for working with large persistent data sets.
  • Tables.jl : This package provides four useful interface functions for working with tabular data in a variety of formats.
  • MAT.jl : Julia module for reading MATLAB files.
  • StatFiles.jl : FileIO.jl integration for Stata, SPSS, and SAS files.
  • ReadStat.jl : Read files from Stata, SAS, and SPSS.

CSV files

Reading and writing CSV files.

  • CSV.jl : Utility library for working with CSV and other delimited files in the Julia programming language.
  • DelimitedFiles.jl : A package for reading and writing files with delimited values (Originally a Julia stdlib).
  • CSVFiles.jl : FileIO.jl integration for CSV files.
  • ReadWriteDlm2.jl : CSV IO. Works like readdlm/writedlm, but uses a decimal comma by default. Additionally supports the Date, DateTime, Time, Complex, Missing and Rational types.
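
For orientation, the `readdlm`/`writedlm` pair mentioned above can be sketched as a simple round trip with DelimitedFiles. The matrix values and the temporary file are hypothetical and exist only for this example.

```julia
using DelimitedFiles  # bundled with standard Julia distributions; otherwise add it via Pkg

# Hypothetical sample data: a small numeric matrix.
data = [1.5 2.0; 3.25 4.0]

# Write the matrix to a temporary file using ',' as the delimiter.
path = tempname() * ".csv"
writedlm(path, data, ',')

# Read it back, asking for Float64 elements.
roundtrip = readdlm(path, ',', Float64)

# Clean up the temporary file.
rm(path)
```

For files with headers, quoting, or mixed column types, CSV.jl is usually the more convenient choice.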

Parquet files

Apache Parquet format

  • Parquet.jl : Julia implementation of a reader and writer for the Parquet columnar file format.
  • Parquet2.jl : Another pure Julia implementation of the Parquet tabular data binary format.

HDF5 files

HDF5 format supports tables as well as heterogeneous data.

  • HDF5.jl : Library for reading HDF5 files, a widely used file format for general data.
  • HDF5Logger.jl : Allows logging individual frames of data to an HDF5 file over time.

Database API

  • DBInterface.jl : An abstract DBI interface to provide a database-independent API protocol that all database drivers can be expected to comply with.
  • IndexedTables.jl : Tabular data structures where some of the columns form a sorted index. It provides the backend to JuliaDB.jl.
  • JDBC.jl : Julia interface to Java database drivers.
  • LevelDB.jl : Julia interface to Google’s LevelDB key value database.
  • Memcache.jl : Julia memcached client.
  • ODBC.jl : A low-level ODBC interface for the Julia programming language. Tabular Data I/O in Julia.
  • QuandlAccess.jl : Convenient access to the Quandl data service.

Accessing datasets

For tabular data files, see fileio

SQL and Relational Database Management Systems

Wikipedia: SQL

  • duckdb : An in-process SQL OLAP database management system with a Julia API.
  • DataKnots.jl : An extensible, practical and coherent algebra of query combinators.
  • LibPQ.jl : A Julia wrapper for the PostgreSQL libpq C library.
  • MySQL.jl : Julia bindings and helper functions for the MariaDB/MySQL C library.
  • Octo.jl : An SQL query DSL in Julia to be used with other SQL drivers.
  • SparkSQL.jl : Working with Apache Spark data using just SQL.
  • SQLite.jl : Julia interface to the SQLite library with support for operations on DataFrames.
  • SQLStrings.jl : @sql_cmd macro for SQL query strings.
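
To give a feel for how a driver plugs into the DBInterface protocol, the following is a minimal sketch using SQLite.jl with an in-memory database. The `people` table and its rows are hypothetical, and both SQLite.jl and DataFrames.jl are assumed to be installed.

```julia
using SQLite      # re-exports the DBInterface module
using DataFrames

# Open an in-memory database (nothing is written to disk).
db = SQLite.DB()

# Hypothetical table and rows, invented for illustration.
DBInterface.execute(db, "CREATE TABLE people (id INTEGER, name TEXT)")
DBInterface.execute(db, "INSERT INTO people VALUES (?, ?)", (1, "Ada"))
DBInterface.execute(db, "INSERT INTO people VALUES (?, ?)", (2, "Grace"))

# Query results are Tables.jl-compatible, so they can be collected into a DataFrame.
df = DataFrame(DBInterface.execute(db, "SELECT id, name FROM people ORDER BY id"))
```

Other drivers that implement DBInterface follow the same `DBInterface.execute` pattern, though connection setup and parameter placeholder syntax differ by driver.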

NoSQL databases

Wikipedia: NoSQL