Max

Genomics

BIG DATA SOLUTIONS FOR PRECISION MEDICINE

We provide distributed, scalable, and efficient informatics solutions, together with computational tools for working with large and complex genomic, proteomic, and clinomic data in health care.  We design data models and data warehouses for secure data storage, sharing, and management.  We also offer computational tools that identify, in real time, clinically actionable genetic variants that can be used for disease diagnosis and personalized treatment.
Solutions for applying genomic and clinical information to personalized medicine, particularly new and better genomics-based diagnostic tests and personalized treatments.
Solutions for customizing healthcare, with medical decisions that select appropriate and optimal therapies based on a patient’s genetic content, lifestyle, and environmental data.

Artificial Intelligence

Genomics & Clinomics

Precision Medicine

Advanced artificial intelligence (AI) approaches for discovering hidden and novel knowledge from large-scale literature, data sources, and diverse omic data, together with clinical data derived from electronic health records (EHRs).

Software

SeqHBase is a big data toolset built on Apache Hadoop and HBase infrastructure.  It is designed for analyzing family-based sequencing data to detect de novo, inherited homozygous, or compound heterozygous mutations.  SeqHBase takes as input BAM files (for coverage of the 3 billion sites of a genome), VCF files (for variant calls), and functional annotations (for variant prioritization).  It runs in a distributed and fully parallel manner across multiple data nodes.  We applied SeqHBase to a 5-member nuclear family and a 10-member three-generation family with whole genome sequencing (WGS) data, as well as a 4-member nuclear family with whole exome sequencing (WES) data.  Analysis times scaled linearly with the number of data nodes.  With 20 data nodes, SeqHBase took about 5 seconds to analyze the WES familial data and approximately 1 minute to analyze the 10-member WGS familial data.  These results demonstrate SeqHBase's high efficiency and scalability.  In addition, SeqHBase is customizable and can be scaled to match the available data volume.  As more data become available, more data nodes can be added, and the newly added nodes are incorporated seamlessly into the existing system.  SeqHBase can be applied to manipulate and analyze millions of WGS data sets.
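The three inheritance patterns named above can be illustrated with a small, single-machine sketch. This is not SeqHBase's implementation: the `TrioCall` record, the `classify_trio` function, and the encoding of genotypes as alternate-allele counts (0, 1, or 2) are assumptions made for illustration, and the rules cover only a child with both parents.

```python
from collections import defaultdict
from typing import NamedTuple


class TrioCall(NamedTuple):
    """One variant site genotyped in a child and both parents (hypothetical record)."""
    gene: str
    variant: str
    child: int   # number of alternate alleles: 0, 1, or 2
    father: int
    mother: int


def classify_trio(calls):
    """Return (de_novo, homozygous, compound_het) for a list of TrioCall records."""
    de_novo, homozygous = [], []
    paternal = defaultdict(list)  # gene -> het variants inherited from the father only
    maternal = defaultdict(list)  # gene -> het variants inherited from the mother only
    for c in calls:
        if c.child > 0 and c.father == 0 and c.mother == 0:
            # Alternate allele present in the child but in neither parent.
            de_novo.append(c.variant)
        elif c.child == 2 and c.father == 1 and c.mother == 1:
            # Child is homozygous; both parents are heterozygous carriers.
            homozygous.append(c.variant)
        elif c.child == 1 and c.father == 1 and c.mother == 0:
            paternal[c.gene].append(c.variant)
        elif c.child == 1 and c.father == 0 and c.mother == 1:
            maternal[c.gene].append(c.variant)
    # Compound heterozygous: a gene hit by at least one paternal-only
    # and at least one maternal-only heterozygous variant.
    compound_het = {
        gene: (paternal[gene], maternal[gene])
        for gene in paternal
        if gene in maternal
    }
    return de_novo, homozygous, compound_het


calls = [
    TrioCall("GENE1", "v1", child=1, father=0, mother=0),
    TrioCall("GENE2", "v2", child=2, father=1, mother=1),
    TrioCall("GENE3", "v3", child=1, father=1, mother=0),
    TrioCall("GENE3", "v4", child=1, father=0, mother=1),
    TrioCall("GENE4", "v5", child=1, father=1, mother=0),
]
de_novo, homozygous, compound_het = classify_trio(calls)
```

In a real pipeline the genotypes would come from the VCF files, and coverage from the BAM files would be needed to confirm that a parent's reference call at a candidate de novo site is reliable.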

More tools are under development...

Text mining is a specialized data mining method that extracts information (e.g., facts, biological processes, or diseases) from text, such as scientific literature.  We used natural language processing (NLP), machine learning, and Big Data infrastructure to design and develop a distributed and scalable framework that extracts information, such as mentions of breast, prostate, and/or lung cancers, and then builds prediction models to classify the information extracted from more than 27,000 full-text articles downloaded from PubMed Central.  We employed three classification algorithms, Naive Bayes, Support Vector Machine (SVM), and Logistic Regression, and evaluated each prediction model using 5-fold cross validation on the 27,000 full-text articles.  The framework was developed on a Big Data infrastructure consisting of an Apache Hadoop cluster together with Apache Spark and a Cassandra database.  Mining more than 27,000 full-text articles took about 5 minutes on the Big Data platform, compared with more than 10 hours without any Big Data infrastructure.  This shows that mining large-scale biomedical literature can be significantly accelerated on a Big Data infrastructure.  The accuracy, precision, and recall of predicting a cancer type with any of the three machine learning methods on the 27,000 full-text articles were comparable to or better than those obtained with other libraries, such as Weka and TagHelper Tools.  Both the time efficiency and the accuracy of our scalable framework were promising, and this strategy should provide tangible benefits to medical research.
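As a rough illustration of the evaluation step, the sketch below runs k-fold cross validation for a multinomial Naive Bayes text classifier using only the Python standard library. It is not the framework's code: the framework described above runs on Apache Spark and also uses SVM and Logistic Regression, while this example is single-machine, covers Naive Bayes only, and uses made-up toy documents. The function names `train_nb`, `predict_nb`, and `cross_validate` are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Fit a multinomial Naive Bayes model from whitespace-tokenized documents."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)  # label -> token frequencies
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest log-posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for tok in text.lower().split():
            if tok in vocab:  # tokens unseen in training are ignored
                p = (word_counts[label][tok] + 1) / (total_words + len(vocab))
                score += math.log(p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def cross_validate(docs, labels, k=5, seed=0):
    """Return mean accuracy over k folds; assumes at least k documents."""
    idx = list(range(len(docs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    accuracies = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        model = train_nb([docs[i] for i in train], [labels[i] for i in train])
        correct = sum(predict_nb(model, docs[i]) == labels[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k


# Toy data for illustration only; not taken from PubMed Central.
docs = ["breast cancer mammography screening"] * 5 + ["prostate cancer psa biopsy"] * 5
labels = ["breast"] * 5 + ["prostate"] * 5
accuracy = cross_validate(docs, labels, k=5)
```

Each fold is held out once while the model is trained on the remaining four, so every document is used for testing exactly one time.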
OmicShare is a collaborative work environment that enables users to easily store, manage, and share all types of instrumental and analytical data files for project management in biomedical research and/or health care.  It facilitates secure information sharing and reduces the risk of data loss.  OmicShare has a user-friendly interface accessed through an Internet browser.  Data files are uploaded by selecting, copying, or simply dragging and dropping them, and are stored in a robust underlying database.  The database can be any relational database, such as Oracle, MySQL, or PostgreSQL; a Big Data based version of OmicShare is under development.  Users can upload or download multiple subfolders and files with a single click.  The data supplier or system administrator can grant collaborators/physicians different permissions on folders or files.  OmicShare allows users to share files with collaborators/physicians quickly, easily, and professionally.  Users can securely and quickly navigate to the projects or shared information in which they are involved, communicate with collaborators/physicians inside and outside their organizations, upload or download one or more data files with one click, and download analyses.  Click here to evaluate the software framework.
To deal with large and complex genomic, proteomic, and clinomic data in biomedical research and/or health care, OmicShare enables users to easily store, manage, and share any type of instrumental and analytical data file for project management and information sharing.  It facilitates secure information sharing and reduces the risk of data loss.
  • Easy to use: OmicShare has a user-friendly interface accessed through an Internet browser, such as Internet Explorer, Firefox, Safari, or Opera.  Multiple data files can be uploaded to or downloaded from the system with a single click.  Data files can also be easily assigned different permissions for multiple collaborators/physicians.
  • Supports any type of data: OmicShare allows users to manage any type of omic data file, including instrumental raw data, image data, audio/video data, gene expression data, proteomic data, genotyping data, flow cytometry data, office documents, PDF files, and so on.
  • Quick upload of multiple data files: By simply selecting, copying, or dragging and dropping files, large numbers of data files are uploaded to the system and stored in a robust underlying database.
  • Easy project management: OmicShare allows the data supplier or system administrator to move data folders or files within the system to reorganize projects easily.
  • Secure collaboration: OmicShare allows users to share files with collaborators/physicians quickly, easily, and professionally.  The data supplier or system administrator can grant collaborators/physicians different permissions on projects, folders, or files.
  • Quick download of multiple data files: Multiple subfolders and data files in one project or folder can be downloaded with a single click.  All of the subfolders and data files are saved to the client machine with the project's original organization preserved.
  • Supports multiple databases: All information in OmicShare is stored in a relational database for easy maintenance.  The database can be any relational database, such as Oracle, MySQL, or PostgreSQL.
  • Try it online free: If you are a member of OmicShare, click here to log in.  If you are not yet a member and would like to evaluate the OmicShare system, click here to sign up.

Please note: your browser needs to support a Java runtime environment, as OmicShare has been developed in Java.  One of the recommended browsers is Firefox.

Contact

Contact us via email or through the contact form for your inquiries.
