The free/open source approach has grown from a minor activity to become a significant producer of robust, task-orientated software for a wide variety of situations and applications. To life science informatics groups, these systems present an appealing proposition - high quality software at a very attractive price. Open source software in life science research considers how industry and applied research groups have embraced these resources, discussing practical implementations that address real-world business problems. The book is divided into five parts. Part one looks at laboratory data management and chemical informatics, covering software such as Bioclipse, OpenTox, ImageJ and KNIME. In part two, the focus turns to genomics and bioinformatics tools, with chapters examining GenomicTools and EBI Atlas software, as well as the practicalities of setting up an 'omics' platform and managing large volumes of data. Chapters in part three examine information and knowledge management, covering a range of topics including software for web-based collaboration, open source search and visualisation technologies for scientific business applications, and specific software such as DesignTracker and Utopia Documents. Part four looks at semantic technologies such as Semantic MediaWiki, TripleMap and Chem2Bio2RDF, before part five examines clinical analytics, and validation and regulatory compliance of free/open source software. Finally, the book concludes by looking at future perspectives and the economics of free/open source software in industry.
Lee Harland is currently leading the information engineering group at Pfizer - a group tasked with developing cutting edge software that helps scientists use internal and external information more effectively. He is also a leading member of the pharma-industry pre-competitive group, the Pistoia Alliance, and has 13 years' experience in bioinformatics, software development and information science within major pharma.
Mark Forster is currently a senior information domain specialist within the Syngenta R&D Information Systems (RDIS) group, supporting R&D scientists in the fields of small molecule discovery and development, plant breeding and biotechnology. He has 15 years of industrial experience in scientific software development, deployment and support in the US and the UK.
Table of contents
Dedication
List of figures and tables
Foreword
About the editors
About the contributors
Introduction
Chapter 1: Building research data handling systems with open source tools
Abstract:
1.1 Introduction
1.2 Legacy
1.3 Ambition
1.4 Path chosen
1.5 The 'ilities
1.6 Overall vision
1.7 Lessons learned
1.8 Implementation
1.9 Who uses LSP today?
1.10 Organisation
1.11 Future aspirations
Chapter 2: Interactive predictive toxicology with Bioclipse and OpenTox
Abstract:
2.1 Introduction
2.2 Basic Bioclipse-OpenTox interaction examples
2.3 Use Case 1: Removing toxicity without interfering with pharmacology
2.4 Use Case 2: Toxicity prediction on compound collections
2.5 Discussion
2.6 Availability
Chapter 3: Utilizing open source software to facilitate communication of chemistry at RSC
Abstract:
3.1 Introduction
3.2 Project Prospect and open ontologies
3.3 ChemSpider
3.4 ChemDraw Digester
3.5 Learn Chemistry Wiki
3.6 Conclusion
3.7 Acknowledgments
Chapter 4: Open source software for mass spectrometry and metabolomics
Abstract:
4.1 Introduction
4.2 A short mass spectrometry primer
4.3 Metabolomics and metabonomics
4.4 Data types
4.5 Metabolomics data processing
4.6 Metabolomics data processing using the open source workflow engine, KNIME
4.7 Open source software for multivariate analysis
4.8 Performing PCA on metabolomics data in R/KNIME
4.9 Other open source packages
4.10 Perspective
4.11 Acknowledgments
Chapter 5: Open source software for image processing and analysis: picture this with ImageJ
Abstract:
5.1 Introduction
5.2 ImageJ
5.3 ImageJ macros: an overview
5.4 Graphical user interface
5.5 Industrial applications of image analysis
5.6 Summary
Chapter 6: Integrated data analysis with KNIME
Abstract:
6.1 The KNIME platform
6.2 The KNIME success story
6.3 Benefits of 'professional open source'
6.4 Application examples
6.5 Conclusion and outlook
6.6 Acknowledgments
Chapter 7: Investigation-Study-Assay, a toolkit for standardizing data capture and sharing
Abstract:
7.1 The growing need for content curation in industry
7.2 The BioSharing initiative: cooperating standards needed
7.3 The ISA framework - principles for progress
7.4 Lessons learned
7.5 Acknowledgments
Chapter 8: GenomicTools: an open source platform for developing high-throughput analytics in genomics
Abstract:
8.1 Introduction
8.2 Data types
8.3 Tools overview
8.4 C++ API for developers
8.5 Case study: a simple ChIP-seq pipeline
8.6 Performance
8.7 Conclusion
8.8 Resources
Chapter 9: Creating an in-house 'omics data portal using EBI Atlas software
Abstract:
9.1 Introduction
9.2 Leveraging 'omics data for drug discovery
9.3 The EBI Atlas software
9.4 Deploying Atlas in the enterprise
9.5 Conclusion and learnings
9.6 Acknowledgments
Chapter 10: Setting up an 'omics platform in a small biotech
Abstract:
10.1 Introduction
10.2 General changes over time
10.3 The hardware solution
10.4 Maintenance of the system
10.5 Backups
10.6 Keeping up-to-date
10.7 Disaster recovery
10.8 Personnel skill sets
10.9 Conclusion
10.10 Acknowledgements
Chapter 11: Squeezing big data into a small organisation
Abstract:
11.1 Introduction
11.2 Our service and its goals
11.3 Manage the data: relieving the burden of data-handling