Analysis of experimental data and design of relational equations in life science

Let’s examine the data thoroughly and find out the essence
We analyze biological experimental data in detail and feed the results back into quantitative mathematical modeling and experimental design. We relaunched as the “Data-Driven Biology” laboratory in April 2021. Data-driven biology is a field that works out what is needed, and what is missing, in order to analyze experimental data, and then provides it. This differs from AI-driven or mathematics-driven approaches. We focus on pre-processing the data using our domain knowledge in biology, and then use machine learning and mathematical models to gain knowledge. We enjoy designing relational equations that represent biological functions, subject to physical constraints and the laws derived from experimental data.


Oct 08 2021   Our joint research with Inagaki Lab. has been accepted by the Journal of Visualized Experiments. Mr. Fujikawa contributed to the development of easy-to-use cell force estimation software.
Oct 01 2021   New member joined.
Aug 10 2021   Lab retreat @NAIST.
May 01 2021   New members joined.
Apr 01 2021   Dr. T Kokaji joined as assistant professor.
Apr 01 2021   Lab name was changed to Data-Driven Biology.
Mar 31 2021   Dr. Kunida moved to Fujita Health University.
Oct 01 2020   New members joined.
Sep 23 2020   Lab retreat with IkedaLab@NAIST.
Apr 30 2020   New member joined.
Nov 11 2019   Assoc Prof Sakumura and Assist Prof Kunida received Best Special Session Paper Award at APSIPA ASC 2019.
Oct 01 2019   Collaboration work with Hashimoto Lab@NAIST was accepted by Plant Physiology.
Aug 07 2019   Lab retreat with IkedaLab@NAIST.
Jul 17 2019   The homepage has been renewed.
Dec 15 2018   Koyama presented his work on a feature selection algorithm at ICONIP 2018.
Aug 09 2018   Lab retreat with IkedaLab@NAIST.
May 07 2018   New members joined.
Apr 18 2018   Collaboration work with Iwadate Lab (Yamaguchi Univ) was accepted by Physical Review E.
Mar 14 2018   T. Yamada’s paper was published in Scientific Reports.
Nov 06 2017   Haruki Inoue joined the lab.
Oct 01 2017   Lab Homepage was launched.


Data-driven research on cell morphogenesis
Cells decide how to move and how to shape their own morphology using the forces they generate. This project aims to elucidate this decision-making process from experimental data. Applying principles found in previous research (e.g., bistability from intermolecular interactions) to every phenomenon is a mathematics-driven approach, not a data-driven one. We conduct data-driven research on cell morphogenesis by carefully observing experimental data and by treating physical quantities other than molecules as signals.

Data-driven research on biological tissue formation
We are trying to understand the formation of biological tissues in terms of a variety of physical quantities. Observing biological data, we find that organisms are not robust to external stimuli but resilient: they are affected by a stimulus, yet they recover from its effects and achieve their goals. We are analyzing data and developing quantitative models for resilient somitogenesis, angiogenesis, and regulation of organ size.

Estimation of forces generated by cells from traction force microscopy images
Nothing deforms without a force, and cells are no exception. Cellular deformation therefore cannot be explained from molecular information alone. To study cellular deformation in a data-driven way, we need data on cellular forces, but these forces cannot be observed directly. We are developing a method that accurately estimates cellular forces within the framework of Bayesian statistics from data observed by traction force microscopy.
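
To make the idea of Bayesian force estimation concrete, here is a minimal sketch of a textbook linear-Gaussian inverse problem. It is not the method being developed in the lab. It assumes that observed displacements u depend linearly on the unknown forces f through a response matrix G (u = G f + noise), with Gaussian noise and a zero-mean Gaussian prior on f. Under these assumptions the posterior mean is f_hat = (G^T G + lam I)^(-1) G^T u, where lam is the ratio of noise variance to prior variance. The matrix G, the values of f_true, and the function names are all hypothetical.

```python
# Illustrative sketch only: a linear-Gaussian inverse problem,
# not the lab's actual force estimation method.
# Assumed model: u = G f + noise, noise ~ N(0, sigma^2 I), prior f ~ N(0, tau^2 I).
# Posterior mean (also the MAP estimate): f_hat = (G^T G + lam I)^{-1} G^T u,
# with lam = sigma^2 / tau^2.

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def map_force(G, u, lam):
    """Return the posterior-mean force vector for displacements u."""
    m, n = len(G), len(G[0])
    GtG = [[sum(G[k][i] * G[k][j] for k in range(m)) + (lam if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    Gtu = [sum(G[k][i] * u[k] for k in range(m)) for i in range(n)]
    return solve(GtG, Gtu)

# Hypothetical response matrix: 3 displacement observations, 2 unknown forces.
G = [[1.0, 0.5],
     [0.5, 1.0],
     [0.2, 0.2]]
f_true = [2.0, -1.0]
# Noise-free synthetic displacements generated from the assumed model.
u = [sum(g * f for g, f in zip(row, f_true)) for row in G]

f_hat = map_force(G, u, lam=1e-9)      # very weak prior: close to least squares
f_shrunk = map_force(G, u, lam=10.0)   # strong prior: estimate pulled toward zero
print(f_hat, f_shrunk)
```

The parameter lam expresses how much the prior is trusted relative to the data: a small value reproduces the least-squares solution, while a large value shrinks the estimated forces toward zero. Real traction force microscopy involves an elastic response function for the substrate and far larger systems, which this toy example leaves out.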

The technology for simultaneous measurement of multiple molecules, called omics analysis, is being developed. This has led to the discovery of many molecules involved in diseases. However, we cannot understand the mechanisms of biological phenomena by simply listing these molecules. We aim to elucidate biological systems in a data-driven manner by using multiple omics data to construct a network consisting of many molecules (trans-omics analysis).

Real-time optimal control of cellular systems using control theory and machine learning
Cells regulate cellular functions such as migration, fate determination (proliferation and differentiation), and metabolism by sensing environmental information such as growth factors and hormones and driving intracellular signaling and gene expression. We use control theory and machine learning to study analytical algorithms for artificial real-time control of cellular functions by designing environmental information optimally. Our goal is to develop a data-driven cellular control system that integrates measurement, analysis, and control.

Diagnosis of disease by machine learning using human breath gas
A driver’s blood alcohol concentration can be measured from the breath because breath components are exchanged with blood components in the alveoli. By the same principle, it should be possible to diagnose diseases using breath tests instead of invasive tests such as blood tests. We are working on this problem in collaboration with doctors and sensor development engineers. The advancement of non-invasive, simple tests with high accuracy will lead to preventive medicine.

Analysis and mathematical model of molecular transport and localization by intracellular actin wave
Supply is the most important factor in running a nation and in waging war. Similarly, the transport of materials is essential for cells to express their functions. Energy does not simply fall everywhere. Looking at images of neurons, we can see that actin fibers are involved not only in forming the cytoskeleton: actin waves also take part in transporting materials. We will build a quantitative mathematical model of the actin wave based on experimental data and elucidate its mathematical principles.

Development of methods to integrate varying data and individually observed data (dealing with system noise)
Cells respond in different ways even under the same conditions. In addition, multiple types of molecules and phenotypes cannot be observed simultaneously and are instead recorded as individually observed data. Such “variations” and “individual observations” are major obstacles to analysis in biology. They are not something that experimental researchers can solve. We aim to solve them with data-driven biology.

Development of semi-automatic quantification software for cell images (dealing with observation noise)
In human society, there are no perfect laws, but we have backup systems for when problems occur. Similarly, there is no fully automated algorithm for quantifying features from biological images. An automated algorithm that is tuned to particular images is an “overfitted algorithm” and does not generalize. We are developing “semi-automatic” software in which users can easily step in and back up the algorithm, rather than fully automated software.

Network estimation using gene expression time series data
The interaction between genes during development has been studied previously. In recent years, time series of gene expression in developmental processes have become available and more information can be handled. We are attempting to estimate gene-gene interactions using the expression time series of many types of genes. In particular, we are trying to estimate gene networks in the development of the chick neural tube and mouse oocytes.
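
As a rough illustration of what estimating interactions from expression time series can mean, the sketch below fits a linear two-gene model by least squares. It is an assumption made for illustration, not the lab's method: it supposes that expression at the next time point is a linear function of the current expression, x(t+1) = A x(t), and recovers the interaction matrix A from a single trajectory. The function name, the matrix A_true, and the synthetic data are hypothetical.

```python
# Illustrative sketch only: least-squares fit of an assumed linear model
# x(t+1) = A x(t) for two genes. Not the lab's actual estimation method.

def fit_two_gene_network(series):
    """series: list of (x1, x2) expression values at successive time points.
    Returns the 2x2 matrix A minimising sum_t ||x(t+1) - A x(t)||^2."""
    current = series[:-1]
    following = series[1:]
    # S = sum_t x(t) x(t)^T (symmetric 2x2 matrix)
    s11 = sum(a * a for a, b in current)
    s12 = sum(a * b for a, b in current)
    s22 = sum(b * b for a, b in current)
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("time series is too degenerate to identify A")
    s_inv = [[s22 / det, -s12 / det],
             [-s12 / det, s11 / det]]
    A = []
    for i in range(2):
        # Row i of C = sum_t x(t+1) x(t)^T
        c1 = sum(nxt[i] * cur[0] for cur, nxt in zip(current, following))
        c2 = sum(nxt[i] * cur[1] for cur, nxt in zip(current, following))
        # Row i of A = C S^{-1}
        A.append([c1 * s_inv[0][0] + c2 * s_inv[1][0],
                  c1 * s_inv[0][1] + c2 * s_inv[1][1]])
    return A

# Hypothetical ground truth: gene 2 represses gene 1, gene 1 activates gene 2.
A_true = [[0.9, -0.2],
          [0.3, 0.8]]
series = [(1.0, 0.5)]
for _ in range(10):
    x1, x2 = series[-1]
    series.append((A_true[0][0] * x1 + A_true[0][1] * x2,
                   A_true[1][0] * x1 + A_true[1][1] * x2))

A_hat = fit_two_gene_network(series)
print(A_hat)
```

In this toy setting the sign of an off-diagonal entry of A_hat is read as activation (positive) or repression (negative). Real expression data are noisy, involve many more genes than time points, and are rarely linear, so practical network estimation needs regularization and model selection that this sketch omits.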


Yuichi SAKUMURA(Professor)saku[a]
Toshiya KOKAJI(Assistant Professor)
Tatsuya YAMADA(Research associate)
Ikuko SASAGAWA(Secretary)
Katsuyuki KUNIDA(Visiting Associate Professor; Fujita Health Univ)
Doctoral program
Master's program 2 (Alphabetical order)
Master's program 1 (Alphabetical order)
YuFei WU
Zhe XU


Lab retreat 2021@NAIST 2021/8/10-11 ; w/ Ikeda Lab@NAIST

Lab retreat 2020@Koyasan 2020/9/23-24 ; w/ Ikeda Lab@NAIST

Lab retreat 2019@Tottori 2019/8/7-9 ; w/ Ikeda Lab@NAIST

Lab retreat 2018@Gujo-Hachiman 2018/8/7-9 ; w/ Ikeda Lab@NAIST

Lab retreat 2017@Awaji 2017/8/29-31 ; w/ Ikeda Lab@NAIST