Breakthroughs in data capture, genome sequencing, medical imaging, and
other fields of biological research, coupled with the ubiquity of cheap
digital storage, have paved the way for massive amounts of biological
data. Research suggests that by the year 2025, between 2 and 40 exabytes of
human genomic data alone will be collected every year.
Unfortunately, most mainstream software can’t process data on that scale
efficiently, which leaves these troves of data underutilized.
Julia provides a wide variety of facilities for using and processing
data effectively. It can efficiently store data structures in memory for
quick access, but when datasets are too large to fit into memory, Julia
can employ memory mapping of data files stored on disk. This allows for
fast and efficient processing, even when memory is limited.
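To make the idea of memory mapping concrete, here is a minimal, language-neutral sketch written in Python rather than Julia. It illustrates the operating-system technique only and does not use any Julia API; the file name `sequence.bin`, its one-byte-per-nucleotide layout, and the helper `count_base` are assumptions made for the example.

```python
import mmap
import os
import tempfile

# Hypothetical data file: one byte per nucleotide (assumed layout).
path = os.path.join(tempfile.mkdtemp(), "sequence.bin")
with open(path, "wb") as f:
    f.write(b"ACGT" * 250_000)  # 1,000,000 bytes on disk


def count_base(path, base, start, stop):
    """Count occurrences of `base` in bytes [start, stop) of a mapped file.

    The file is mapped rather than read in full, so only the pages
    backing the requested slice need to be brought into memory.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return mapped[start:stop].count(base)


# Inspect a small window of the file without loading the rest.
window_count = count_base(path, b"G", 0, 1000)
```

The same pattern scales to files far larger than available memory, because the operating system pages data in and out on demand.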
Data are often messy and complex. Julia doesn’t take a one-size-fits-all
approach to data structures; instead, it provides a sophisticated yet
easy-to-use system, where users can employ whichever structure most
efficiently and sensibly stores their data. Users are not forced to
choose from among a strict, limited set of data types. When no existing
data type fits the bill, users can create their own types and define any
set of operations for them. This kind of extensibility and flexibility
is at the heart of Julia.
Here is a glimpse of how Julia is solving complex use cases in the life sciences.
Modeling Cancer Evolution
Cancer Genomics, Source: JuliaComputing.com
Researchers predominantly study the growth of tumors to interpret cancer genomes. A team of researchers in the UK tapped into Julia to run these tumor growth simulations. Julia not only offers them fast and easy ways to run these simulations, but also has a vibrant community contributing to projects like BioJulia, which help these researchers take their studies forward.
Augmedics - Medical Imaging
Medical Imaging, Source: JuliaComputing.com
Augmedics, a medical tech firm, is using Julia to track and render images in real time to build three-dimensional images of their patients’ anatomy, in effect an equivalent of X-ray vision.
Medical Diagnosis, Source: JuliaComputing.com
Diabetic retinopathy is an eye disease that affects more than 126 million diabetics and accounts for more than 5% of blindness cases worldwide. Timely screening and diagnosis can help prevent vision loss for millions of diabetics worldwide. IBM and Julia Computing analyzed eye fundus images provided by Drishti Eye Hospitals and built a deep learning solution that provides eye diagnosis and care to thousands of rural Indians.
Modern systems biology and systems pharmacology, the leading scientific disciplines for biological prediction, make heavy use of ordinary, stochastic, delay, discrete, and partial differential equations. These domains require efficient solvers as simulations can be very computationally expensive. A direct feature comparison to solver suites in other languages shows that Julia’s DifferentialEquations.jl is a leader in the field for differential equation solver software.
The table below, created by Chris Rackauckas, a PhD student at UC Irvine and the lead developer of DifferentialEquations.jl, summarizes the feature comparison mentioned above.
Julia’s flexibility also means that researchers in the field have congregated directly in the JuliaDiffEq organization to implement the newest algorithms in Julia, including algorithms that were shown to be 12 to 10^6 times more efficient on stochastic biological models than the standard methods found in other libraries.
Genome sequencing produces massive quantities of data – the human genome
consists of over 3 billion nucleotides. However, it can be stored as
just a few thousand runs using run-length encoding. This functionality
in Julia was developed by pharmaceutical scientists who helped to create
a package called RLEVectors.jl. This package facilitates vector storage
in a memory-efficient manner using run-length encoding. In benchmarks,
RLEVectors.jl is shown to be 1,000 to 65,000 times faster than similar
functionality from the R Bioconductor package, as can be seen in
this comparative graphic:
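Setting the benchmark aside, the idea behind run-length encoding can be sketched in a few lines. The example below is written in Python as a neutral illustration and is not the RLEVectors.jl API; the function names and the `copy_number` data are hypothetical. It shows how a long vector with few distinct stretches collapses into a short list of (value, run length) pairs, and how an element can be looked up without expanding the vector again.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(values)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the full list."""
    expanded = []
    for value, length in runs:
        expanded.extend([value] * length)
    return expanded


def rle_lookup(runs, index):
    """Return the element at `index` by walking the runs, not the full vector."""
    if index < 0:
        raise IndexError("negative indices are not supported in this sketch")
    end_of_run = 0
    for value, length in runs:
        end_of_run += length
        if index < end_of_run:
            return value
    raise IndexError("index out of range")


# Made-up per-position values with long constant stretches.
copy_number = [2] * 5 + [3] * 3 + [2] * 4
runs = rle_encode(copy_number)
```

Here twelve stored values shrink to three runs; the saving grows with the length of the constant stretches, which is why the approach suits genome-scale data.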
As the scale of data increases, so must the scale of computation. Many
problems in the life sciences lend themselves particularly well to
parallel processing, such as the analysis of single nucleotide
polymorphisms in genome-wide association studies and simulating disease
outbreaks in epidemiological models based on individuals. Julia was
built with effortless parallelism in mind, be it on a single multicore
machine, a supercomputing cluster, or in the cloud.
Regardless of where you run Julia, well-written Julia code is fast, even
in serial. In benchmarks, its performance approaches—and in some cases
beats—that of C and Fortran, the current de facto languages for
performance-critical applications. And because Julia isn’t a statically
compiled language, there’s no waiting around for compilation before you
can run your code. This makes it easy to rapidly prototype and iterate.
Recently a Julia package called Gillespie.jl was published in the
Journal of Open Source Software. It implements Gillespie’s direct method
for stochastic simulations, which is widely used in fields such as
systems biology and epidemiology, in pure Julia with no parallelism. In
benchmarks, it’s shown to be over 500 times faster than the equivalent
package for R, and over 600 times faster than hand-written R code for
the same tasks. Amazingly, no special optimization tricks were
used to achieve this huge gain in performance; Gillespie.jl is fast
simply by virtue of being built on Julia.
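For readers unfamiliar with Gillespie’s direct method, the following is a minimal sketch of the algorithm in Python. It is not the Gillespie.jl implementation or its interface; the function `gillespie_direct`, the SIR epidemic model, and the parameter values are assumptions chosen for illustration. At each step the method draws an exponentially distributed waiting time from the total reaction rate, then picks which reaction fires with probability proportional to its rate.

```python
import random


def gillespie_direct(x0, changes, rate_functions, t_end, rng):
    """Simulate one trajectory with Gillespie's direct method.

    x0             -- initial counts for each species
    changes        -- per-reaction state change vectors
    rate_functions -- per-reaction functions mapping state -> rate
    """
    t, x = 0.0, list(x0)
    times, states = [t], [tuple(x)]
    while True:
        rates = [f(x) for f in rate_functions]
        total = sum(rates)
        if total <= 0.0:
            break  # no reaction can fire any more
        t += rng.expovariate(total)  # waiting time ~ Exponential(total)
        if t > t_end:
            break
        # Pick reaction j with probability rates[j] / total.
        threshold = rng.random() * total
        chosen = max(j for j, r in enumerate(rates) if r > 0.0)  # rounding fallback
        cumulative = 0.0
        for j, rate in enumerate(rates):
            cumulative += rate
            if threshold < cumulative:
                chosen = j
                break
        x = [xi + d for xi, d in zip(x, changes[chosen])]
        times.append(t)
        states.append(tuple(x))
    return times, states


# Hypothetical SIR epidemic: state is (susceptible, infected, recovered).
beta, gamma, population = 0.3, 0.1, 100
rate_functions = [
    lambda s: beta * s[0] * s[1] / population,  # infection
    lambda s: gamma * s[1],                     # recovery
]
changes = [(-1, 1, 0), (0, -1, 1)]
times, states = gillespie_direct(
    (99, 1, 0), changes, rate_functions, 50.0, random.Random(1)
)
```

Each run produces one random trajectory, so studies typically repeat the simulation many times, which is where per-simulation speed matters.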
We believe that Julia is not just the language of the future, but also
the language of now. It’s a modern solution for modern problems, with
the ability to adapt to new challenges with ease. That’s why we feel
it’s the right choice for the life sciences industry and research.
Julia is on its way to expanding its general biostatistics toolkit to include methodologies
such as Cox proportional hazards regression and Kaplan-Meier survival estimation.
Methods common in epidemiology, such as generalized estimating
equations, will also be implemented soon.
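As a reminder of what one of these methods computes, here is a small sketch of the Kaplan-Meier estimator in Python. It is an illustration only and does not represent any existing or planned Julia package; the function name and the toy follow-up data are made up. At each distinct event time the running survival probability is multiplied by the fraction of at-risk subjects who did not experience the event, while censored subjects simply leave the risk set.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct event time.

    durations -- follow-up time for each subject
    observed  -- True if the event occurred, False if the subject was censored
    """
    data = sorted(zip(durations, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            events += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if events > 0:
            survival *= 1.0 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored subjects both leave the risk set
    return curve


# Toy data: six subjects, two of them censored (at times 2 and 4).
durations = [1, 2, 2, 3, 4, 5]
observed = [True, True, False, True, False, True]
curve = kaplan_meier(durations, observed)
```

For this toy data the estimated survival steps down to 5/6, 2/3, 4/9, and finally 0 at times 1, 2, 3, and 5.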
Julia’s compliance with 21 CFR Part 11 will be documented to show that
it’s ready to take on the rigorous needs of clinical trials. Also
crucial for clinical trials is the ability to summarize data into
production-quality tables, listings, and figures, and save them in
common formats such as RTF and PDF. Anyone who has created an adverse
events table for a clinical trial has depended on such functionality
from a software package; having this functionality available in Julia
will be critical for driving adoption of Julia for clinical trials.
The Julia community is doing amazing things. We hope you join us too.
Julia Computing was founded by all the creators
of the language to provide commercial support
to Julia users. We are based in Boston, New York,
San Francisco, London and Bangalore with
customers across the world.
© 2016 - 2020 Julia Computing, Inc. All Rights Reserved.