Data Science
Data retrieval, manipulation, and storage in Julia.
- Julia ML organization
- Julia Data organization
- Julia Databases organization
- Julia stats organization
- Blog on The Lesser Known Normal Forms of Database Design
- Database-like ops benchmark of different languages and libraries.
- Julia-data-science : Notebooks on DS basics with Julia and why it is suitable for data science.
- Julia DataFrames Tutorial by Bogumił Kamiński.
General purpose
- JuliaData/DataFrames.jl : In-memory tabular data in Julia.
- jkrumbiegel/DataFrameMacros.jl : an opinionated take on DataFrame manipulation in Julia with a syntax geared towards clarity, brevity and convenience.
- JuliaData/DataFramesMeta.jl : Metaprogramming tools for DataFrames and AbstractDict objects. These macros improve performance and provide more convenient syntax.
- JuliaData/Tables.jl : A generic interface for tables in Julia.
- JuliaPy/Pandas.jl : A Julia front-end to Python's pandas package.
Data Manipulation
- JuliaData/TableOperations.jl : Common table operations on Tables.jl interface implementations.
- sl-solution/InMemoryDatasets.jl : Multithreaded package for working with tabular data in Julia.
- TheRoniOne/Cleaner.jl : A toolbox of simple solutions for common data cleaning problems.
Database API
- jerryzhenleicai/LevelDB.jl : Julia interface to Google's LevelDB key value database.
- JuliaData/IndexedTables.jl : Tabular data structures where some of the columns form a sorted index. It provides the backend to JuliaDB.jl.
- JuliaDatabases/DBInterface.jl : An abstract DBI interface to provide a database-independent API protocol that all database drivers can be expected to comply with.
- JuliaDatabases/JDBC.jl : Julia interface to Java database drivers.
- JuliaDatabases/ODBC.jl : A low-level ODBC interface for the Julia programming language. Tabular Data I/O in Julia.
- tanmaykm/Memcache.jl : Julia memcached client.
- tk3369/QuandlAccess.jl : Convenient access to the Quandl data service.
SQL and Relational Database Management Systems
- duckdb/duckdb : an in-process SQL OLAP Database Management System with a Julia API.
- invenia/LibPQ.jl : A Julia wrapper for the PostgreSQL libpq C library.
- JuliaComputing/SQLStrings.jl : @sql_cmd macro for SQL query strings.
- JuliaDatabases/MySQL.jl : Julia bindings and helper functions for MariaDB/MySQL C library.
- JuliaDatabases/SQLite.jl : Julia interface to the SQLite library with support for operations on DataFrames.
- MechanicalRabbit/DataKnots.jl : An extensible, practical and coherent algebra of query combinators.
- propelledanalytics/SparkSQL.jl : working with Apache Spark data using just SQL.
- wookay/Octo.jl : an SQL Query DSL in Julia to be used with other SQL drivers.
NoSQL databases
- felipenoris/Mongoc.jl : MongoDB bindings (newer) and a wrapper around libbson, for the Julia language.
- JuliaDatabases/Redis.jl : A fully-featured Redis client for the Julia programming language.
- r3tex/CQLdriver.jl : Interfacing with CQL compliant databases. Used with DataFrames.jl.
- ScottPJones/Mongo.jl : MongoDB bindings for the Julia programming language.
- wildart/LMDB.jl : A wrapper interface to Lightning Memory-Mapped Database (LMDB) key-value embedded data store.
Accessing datasets
For tabular data files, see fileio.
- 4gh/WorldBankData.jl : The World Bank data.
- dfdx/FaceDatasets.jl : A package for easy access to face-related datasets.
- JuliaStats/RDatasets.jl : Julia package for loading many of the datasets available in R.
- neomatrixcode/Faker.jl : A package that generates fake data.
- oxinabox/DataDeps.jl : reproducible data setup for reproducible science.
- tecosaur/DataToolkit.jl : Providing a data CLI for reproducible, flexible, and convenient data management.
- tecosaur/DataToolkit.jl : Reproducible, flexible, and convenient data management.