Date  Time  Location  Instructors  Institution  Title and Description 
Nov 20, 2010  2:00-5:00pm  Genomics Lecture Hall  Tyler Backman & Thomas Girke  UCR  Analysis of Small Molecule Data with R and Bioconductor. Manual for this workshop. Description: This workshop introduces the ChemmineR package for mining drug-like compound and screening data sets. The new version of this R package contains functions for handling and analyzing SDF/MOL files, working with bioactivity data from PubChem, structural similarity searching, clustering compound libraries with a wide spectrum of algorithms, and managing complex compound data sets. In addition, it offers visualization functions for compound clusters and chemical structures. The package is well integrated with the online ChemMine Tools service and allows bidirectional communication between the two. The integration of chemoinformatic tools with the R programming environment has many advantages, such as easy access to a wide spectrum of statistical methods, machine learning algorithms and graphics utilities. Knowledge of the R software, as introduced in the "Introduction to R" course, is required for attending this workshop. Maximum number of participants: 40
Nov 20, 2010  10:00-1:00pm  Genomics Lecture Hall  Thomas Girke  UCR  Introduction to R. Manual for this workshop. Description: R (http://www.r-project.org) is a versatile data analysis environment with a broad spectrum of applications in all areas of science. The associated Bioconductor project provides access to hundreds of additional R packages for the analysis of modern biological and biomedical data sets, such as microarrays, next generation sequencing data, genome annotations, networks, etc. The R software is completely free and runs on all common operating systems. This workshop provides an elementary-level introduction to the R environment and gives users the knowledge required for the subsequent events of this R workshop series. The following topics will be covered: (1) command syntax, (2) basic functions, (3) data import/export, (4) data/object types, (5) graphical display, (6) usage of R packages/libraries (e.g. Bioconductor) and (7) using R for basic data mining operations. Maximum number of participants: 40
Oct 31, 2010  10:00-1:00pm  Genomics Lecture Hall  Tyler Backman, Rebecca Sun & Thomas Girke  UCR  GUI-based Exploration and Visualization of Next Generation Sequence Data. Manual for this workshop. Description: This workshop will introduce the basics of aligning next generation sequence (NGS) data to reference genomes/transcriptomes using cloud/web-based applications. Analysis and visualization of the read pileups along with annotation information will be performed in the free and very easy-to-use IGV genome browser from the Broad Institute. The material will be useful for both complete beginners and intermediate users (e.g. those who attended the previous R workshop on NGS data analysis). No special computer knowledge is required for this workshop. Maximum number of participants: 40
Oct 30, 2010  10:00-4:00pm  Genomics Lecture Hall  Tyler Backman, Rebecca Sun & Thomas Girke  UCR  Analysis of Next Generation Sequencing Data with R and Bioconductor. Manual for this workshop. Description: R and Bioconductor provide extensive utilities for analyzing next generation sequence (NGS) data from technologies such as Illumina (Solexa). This workshop will cover the following topics: (1) basic sequence and string handling; (2) quality assessment utilities; (3) quality filtering; (4) adaptor trimming; (5) sequence alignments; (6) interacting with external alignment programs from R, e.g. SOAP, Maq, Bowtie; (7) read density and SNP analysis; and (8) visualization of genome-scale mapping data. Basic knowledge of the R software, as introduced in the previous tutorial "Introduction to R", will be expected in this workshop. Because analyzing next generation sequencing data at reasonable speed is computationally demanding, users will work on a Linux cluster during this workshop. Maximum number of participants: 40
Oct 29, 2010  4:00-5:00pm  Genomics Lecture Hall  Alex Levchuk, Tyler Backman, Thomas Girke  UCR  Linux Part II: Using IIGB's Linux Cluster. Manual for this workshop. Description: This seminar-style presentation will provide an introduction to the usage of the different load balancing and parallel computing tools available on IIGB's Linux cluster. A discussion will follow to determine the need for future hardware and software upgrades. PIs and users from UCR are invited to attend this event. Maximum number of participants: 40
Oct 29, 2010  1:00-4:00pm  Genomics Lecture Hall  Alex Levchuk, Tyler Backman, Thomas Girke  UCR  Linux Part I: Linux Essentials. Manual for this workshop. Description: The majority of freely available bioinformatics software is designed for Unix/Linux-based operating systems. Basic knowledge of their usage provides free access to the most powerful and up-to-date applications in the field. The workshop will teach beginners the basic command-line syntax for running applications on large data sets on our Linux servers and clusters from a local Windows, Mac or Linux computer. The following topics will be covered: (1) overview of the Linux operating system, (2) file system organization, (3) getting around, (4) basic shell commands and scripts, (5) available software, (6) running software like Bowtie, SOAP, BLAST, HMMER, PHYLIP, EMBOSS, etc. Maximum number of participants: 40
Oct 3, 2010  2:00-5:00pm  Genomics Lecture Hall  Thomas Girke  UCR  Clustering and Data Mining in R. Manual for this workshop. Description: R contains a comprehensive set of functions and libraries for clustering large multidimensional data sets. This course will provide an overview of the usage of the most common clustering techniques in R, such as hierarchical clustering with bootstrap, K-means, PAM, fuzzy clustering, QT, SOM, principal component analysis, multidimensional scaling, biclustering, etc. Knowledge of the R software, as introduced in the previous tutorial "Introduction to R", is required for attending this workshop. Maximum number of participants: 40
Oct 3, 2010  10:00-1:00pm  Genomics Lecture Hall  Thomas Girke  UCR  Microarray Analysis with R & Bioconductor. Manual for this workshop. Description: The statistics software R and the associated Bioconductor project have become the "gold standard" for the analysis of dual-color microarrays and Affymetrix chips. The environment integrates the most advanced analysis tools that are currently available for profiling data. All software components are freely available for all operating systems. This workshop will cover the following topics: (1) data import/export, (2) background correction and normalization procedures for Affymetrix and cDNA arrays, (3) array quality inspection, (4) identification of differentially expressed genes, (5) visualization of genomic information, (6) overview of clustering methods and (7) Gene Ontology (GO) analysis. Basic knowledge of the R software, as introduced in the previous tutorial "Introduction to R", will be expected in this workshop. Maximum number of participants: 40
Oct 2, 2010  10:00-3:00pm  Genomics Lecture Hall  Thomas Girke  UCR  Programming in R. Manual for this workshop. Description: One of the outstanding strengths of the R language is the ease of programming extensions to automate the analysis and mining of almost any type of data. This workshop provides an overview of the basic knowledge needed for writing beginner-level programs in R. The following topics will be introduced: (1) conditional executions, (2) loops, (3) writing functions, (4) techniques for improving speed/memory performance, (5) calling external software, (6) running and debugging R programs, and (7) object-oriented programming in R. Knowledge of the R software, as introduced in the previous tutorial "Introduction to R", is required for attending this workshop. Maximum number of participants: 40
Oct 1, 2010  2:00-6:00pm  Genomics Lecture Hall  Thomas Girke
Mar 6, 2010  2:00-5:00pm  Genomics Lecture Hall  Thomas Girke  UCR
Mar 6, 2010  10:00-12:00pm  Genomics Lecture Hall  Thomas Girke  UCR
Mar 5, 2010  2:00-5:00pm  HMNSS1500 (Humanities)  Thomas Girke  UCR
Mar 5, 2010  9:00-12:00pm  Genomics Lecture Hall  Thomas Girke
Feb 26, 2010  2:00-6:00pm  Genomics Lecture Hall  Tyler Backman, Rebecca Sun & Thomas Girke  UCR
Feb 19, 2010  4:00-5:00pm  Genomics Lecture Hall  Alex Levchuk, Tyler Backman, Thomas Girke  UCR  Linux Part II: Using IIGB's Linux Cluster. Manual for this workshop. Description: This seminar-style presentation will provide an introduction to the usage of the different load balancing and parallel computing tools available on IIGB's Linux cluster. A discussion will follow to determine the need for future hardware and software upgrades. PIs and users from UCR are invited to attend this event. Maximum number of participants: 40
Feb 19, 2010  1:00-4:00pm  Genomics Lecture Hall  Alex Levchuk, Tyler Backman, Thomas Girke  UCR
Jan 30, 2010  10:00-3:00pm  Genomics Lecture Hall  Thomas Girke  UCR
Jan 29, 2010  2:00-6:00pm  Genomics Lecture Hall  Thomas Girke
July 23, 2009  2:30-7:00pm  1104 Batchelor Hall  Tyler Backman & Thomas Girke  UCR
June 25, 2009  2:30-7:00pm  1104 Batchelor Hall  Thomas Girke  UCR
June 5, 2009  2:30-7:00pm  1104 Batchelor Hall  Thomas Girke
July 30, 2008  2:30-7:00pm  1104 Batchelor Hall  Thomas Girke  UCR  Introduction to R. Manual for this workshop. Description: The open source software R (http://www.r-project.org) has revolutionized statistical data analysis for most bioscience and chemistry disciplines. The time required to learn the R software is well invested, since the R environment covers an unmatched spectrum of statistical tools, including an efficient programming language for automating time-consuming analysis routines. The fully integrated Bioconductor project contains many additional R packages, in particular for the analysis of functional genomics and microarray data. Due to their popularity, R and Bioconductor are continuously updated and extended with the latest analysis tools available in the different research fields. The R environment is completely free and runs on all common operating systems. This workshop provides an elementary-level introduction to the R environment covering the following topics: (1) command syntax, (2) basic functions, (3) data import/export, (4) data types, (5) using R for data mining, (6) graphical display and (7) usage of R packages and libraries (e.g. Bioconductor). Maximum number of participants: 20
Sept 27, 2007  2:00-6:00pm  1007 Noel T. Keen Hall  Thomas Girke  UCR  Introduction to R. Manual for this workshop. Description: The open source software R (http://www.r-project.org) has revolutionized statistical data analysis for most bioscience and chemistry disciplines. The time required to learn the R software is well invested, since the R environment covers an unmatched spectrum of statistical tools, including an efficient programming language for automating time-consuming analysis routines. The fully integrated Bioconductor project contains many additional R packages, in particular for the analysis of functional genomics and microarray data. Due to their popularity, R and Bioconductor are continuously updated and extended with the latest analysis tools available in the different research fields. The R environment is completely free and runs on all common operating systems. This workshop provides an elementary-level introduction to the R environment covering the following topics: (1) command syntax, (2) basic functions, (3) data import/export, (4) data types, (5) using R for data mining, (6) graphical display and (7) usage of R packages and libraries (e.g. Bioconductor). Maximum number of participants: 20
April 18, 2006  8:00-2:00pm  Loma Linda University  Bioinformatic Specialists  NCBI  NCBI Mini-Course. Flyer for this workshop. Description: (A) Making Sense of DNA & Protein Sequence: Participants will find a gene within a eukaryotic DNA sequence, predict the function of the implied protein product, and find a 3D modeling template for this protein sequence using NCBI resources. (B) BLAST QuickStart: A practical introduction to the BLAST family of sequence-similarity search programs. Participants will perform simple and specialized searches and learn creative uses of BLAST programs. Organizer: Aileen Gonzales (LLU). Maximum number of participants: 12
Jul 7, 2005  3:00-6:30pm  1007 Noel T. Keen Hall  Thomas Girke  UCR  Introduction into EMBOSS: A Free Open Source Sequence Analysis Package. Manual for this workshop. Description: EMBOSS is the only free and comprehensive sequence analysis package. It contains over 150 very useful command-line tools for analyzing DNA and protein sequences, including pattern searching, phylogenetic analysis, data management, feature predictions, proteomics and more. A detailed description of all its applications can be found on this page. The workshop will provide an introduction to the functionality and usage of the different EMBOSS modules. Knowledge of the basic UNIX commands, as introduced in our 'LINUX Essentials' course, is required for attending this workshop. Maximum number of participants: 12
Apr 28, 2005  1:00-4:30pm  1007 Noel T. Keen Hall  Thomas Girke  UCR  Expression Profiling Analysis with R and Bioconductor. Manual for this workshop. R (http://www.r-project.org) is a complete statistical software package and programming language for data manipulation, calculation and professional graphical display. The fully integrated Bioconductor project covers many additional R packages for statistical data analysis in biosciences, such as tools for the analysis of SNP and transcriptional profiling data derived from SAGE, cDNA microarrays, Affymetrix chips, etc. This workshop will be divided into two sections: the first part will provide a short introduction to R and the second part will focus on the usage of Bioconductor packages for the analysis of Affymetrix chips and dual-color microarrays (RMA, GCRMA, LIMMA, SAM, etc). Maximum number of participants: 12
Mar 29-30, 2005  1:00-4:00pm  Loma Linda University  Bioinformatic Specialists  NCBI  NCBI Field Guide Workshop. Schedule: Download Flyer. Directions to lecture and lab rooms. Maximum number of participants: 12
Mar 17, 2005  2:00-5:30pm  1007 Noel T. Keen Hall  Thomas Girke  UCR  Expression Profiling Analysis with R and Bioconductor. Manual for this workshop. R (http://www.r-project.org) is a complete statistical software package and programming language for data manipulation, calculation and professional graphical display. The fully integrated Bioconductor project covers many additional R packages for statistical data analysis in biosciences, such as tools for the analysis of SNP and transcriptional profiling data derived from SAGE, cDNA microarrays, Affymetrix chips, etc. This workshop will be divided into two sections: the first part will provide a short introduction to R and the second part will focus on the usage of Bioconductor packages for the analysis of Affymetrix chips and dual-color microarrays (RMA, GCRMA, LIMMA, SAM, etc). Maximum number of participants: 12
Feb 22-23, 2005  9:00-5:00pm  City of Hope  Bioinformatic Specialists  NCBI  NCBI Workshops. First day: "A Field Guide to GenBank & NCBI Molecular Biology Resources". Second day: "Exploring 3D Molecular Structures Using NCBI Tools & NCBI QuickScripts". Sign-up fee per computer lab session: $25. Maximum number of UCR participants: 15
Oct 29, 2004  2:00-5:30pm  1007 Noel T. Keen Hall  Thomas Girke  UCR  Introduction into R and Bioconductor. Manual for this workshop. R (http://www.r-project.org) is a complete statistical software package and programming language for data manipulation, calculation and professional graphical display. The fully integrated Bioconductor project covers many additional R packages for statistical data analysis in biosciences, such as tools for the analysis of SNP and transcriptional profiling data derived from SAGE, cDNA microarrays, Affymetrix chips, etc. This workshop will be divided into two sections: the first part will provide an introduction to the basic R commands under Linux and the second part will focus on the usage of Bioconductor packages for Affymetrix chip analysis (RMA, GCRMA, QC display, SAM, etc). Knowledge of the basic UNIX commands, as introduced in our 'LINUX Essentials' course, is required for attending this workshop. Maximum number of participants: 12
July 22, 2004  2:00-5:30pm  1007 Noel T. Keen Hall  Thomas Girke & Josh Lauricha  UCR  Large-Scale Computing on our Bioinfo LINUX Cluster. Manual for this workshop. Our facility currently maintains a 64-CPU LINUX cluster to significantly reduce the running time of computationally expensive bioinformatics applications. For instance, HMM searches of 100,000 protein sequences against the Pfam database can be finished on this cluster within 3-4 days, as opposed to an impractical ~200 days on a single-processor machine. This seminar will provide an introduction to the usage of the different load balancing and parallel computing systems that are available on our cluster. A discussion will follow to determine the need for future hardware growth and software requirements in this area. Maximum number of participants: 12
July 8, 2004  2:00-5:30pm  1007 Noel T. Keen Hall  Thomas Girke  UCR  Introduction into EMBOSS: A Free Open Source Sequence Analysis Package. Manual for this workshop. EMBOSS is the only free and comprehensive sequence analysis package. It contains over 150 very useful command-line tools for analyzing DNA and protein sequences, including pattern searching, phylogenetic analysis, data management, feature predictions, proteomics and more. A detailed description of all its applications can be found on this page. The workshop will provide an introduction to the functionality and usage of the different EMBOSS modules. Knowledge of the basic UNIX commands, as introduced in our 'LINUX Essentials' course, is required for attending this workshop. Maximum number of participants: 12
April 15, 2004  2:00-3:30pm  1007 Noel T. Keen Hall  Jennifer Le Page  ChemBridge  ChemBridge & CRL: Integrative Chemistry Solutions for Drug Discovery. Topics of presentation:

Mar 25, 2004 & Apr 5, 2004  2:00-5:30pm  1007 Noel T. Keen Hall  Thomas Girke  UCR  Large Scale Data Management for Biologists with the Database Software MS Access. Manual and exercises for this workshop. In response to high demand, the bioinformatics facility is offering a workshop for biologists interested in learning user-friendly database software for managing complex data sets from DNA array, proteomic, large-scale sequencing and other high-throughput technologies. MS Access provides a simple but efficient database environment for organizing large data sets without requiring knowledge of any programming language. It is also a very useful tool for pre-designing data structures for future import into more powerful database engines like MySQL, PostgreSQL or Oracle. The software is usually preinstalled on every Windows computer with MS Office. A Mac version is not available yet. This introductory workshop will cover the following topics: data import/export, interoperability with spreadsheet programs like Excel, table relationships, filters, queries, calculations, table joining, duplicate removal, reports and forms. Active Server Pages (ASP) for designing web interfaces will not be covered. Maximum number of participants: 12
Mar 4, 2004  2:00-4:00pm  1007 Noel T. Keen Hall  Thomas Girke  UCR  Chemical Genetics Seminar: Discussion of future screens and introduction to our compound database.
Jan 7, 2004  1:00-3:00pm  1007 Noel T. Keen Hall  Azhar Alavi  Silicon Genetics  Introduction into GeneSpring. This seminar provides a general introduction to GeneSpring, a data mining package for expression profiling. New features of the latest version (6.1) will be introduced, including new filtering tools and clustering techniques.
Oct 30, 2003 & Nov 20, 2003  2:00-5:30pm  1007 Noel T. Keen Hall  Thomas Girke  UCR  GCG Basics (SeqWeb, SeqLab & Command Line). Manual & Exercises for this workshop. These two workshops cover the basics of how to access GCG from SeqWeb, SeqLab and the command line. Since SeqLab is the most powerful and user-friendly GCG environment, the workshop focuses on this interface and gives an overview of the various sequence analysis tools available in GCG. This includes pattern searches, multiple alignments, phylogenetic trees, remote homology detection with HMMER, high-throughput sequence analysis such as BLAST searches in batch mode, and how to make personal BLASTable sequence databases. Participants are encouraged to bring their own real-life problems to the exercise section. Maximum number of participants: 12
Oct 21-22, 2003  9:00-4:00pm  City of Hope  Bioinformaticians from NCBI  NCBI  NCBI Workshop Schedule
Oct 23, 2003  9:00-4:00pm  City of Hope  Bioinformaticians from NCBI  NCBI  NCBI Minicourses Schedule
Wed, July 16, 2003  2:00-5:00pm  1007 Noel T. Keen Hall  Thomas Girke  UCR  Phred/Phrap/Consed. Manual for this workshop.
April 17 & May 1, 2003  3:00-5:30pm  1007 Noel T. Keen Hall  Thomas Girke  UCR  UNIX/LINUX Essentials for Beginners. Manual for this workshop. Topics: The Power of Unix, File System Organization, Getting Around, The Shell, Running Applications, Text Editors
March 7, 2003  Seminar: 10:30-12:00am; Workshop: 1:00-4:00pm  Seminar: Science Library, Rm 240; Workshop: Sproul Hall, Rm 2225  Richard Hughey  UC Santa Cruz  An Introduction to Hidden Markov Models (seminar): Since their introduction to the biological sequence analysis community, profile hidden Markov models have become a standard for high-performance sequence search, classification, and alignment. These basic functions can also form core components of protein structure prediction and genome analysis. This tutorial presents an introduction to the process of creating and using profile hidden Markov models, followed by a discussion of the iterative search methods that enable particularly distant remote homology detection. The emphasis will be on gaining a qualitative understanding of the underlying technology, with only a small amount of mathematics. The SAM HMM Software System (hands-on workshop): The Sequence Alignment and Modeling Software System (SAM) is the HMM system developed at UCSC in Haussler and Krogh's seminal work. It has been continuously improved since the introduction of profile HMMs, now forms the core of our protein structure prediction efforts, and is used by many other research sites. In this workshop, we will use the SAM programs to create and examine HMMs, align sequences, and search databases. We will also use the SAM web servers for protein structure prediction. Attendees are encouraged to bring their own sequence or sequences of interest for building and using HMMs (web version).
Dec 4, 2002  9am-4pm  Watkins, Rm 2101  Mike Troutman  Affymetrix  Affymetrix Hands-On Training (limited to 12 participants): This workshop will cover the Affymetrix image analysis tool MAS 5.0, the data mining software DMT and the database MicroDB.
Nov 8-9, 2002  9am-4pm  UCR Extension  Lukasz Jaroszewski & Dimitrios Morikis  UCSD & UCR  Protein Modeling Workshop: Day 1: "Protein structure, domain databases, sequence and structure analysis"; Day 2: "Homology modeling and rational drug design"
Oct 8-10, 2002  8am-2pm  City of Hope  Bioinformaticians from NCBI  NCBI  NCBI Workshop (Summary): Oct 8th: "A Field Guide to GenBank and NCBI Molecular Biology Resources"; Oct 9th: "Have a BLAST! A Practical Course on the Basic Local Alignment Search Tool (BLAST) from the NCBI"; Oct 10th: "Making Sense of DNA and Protein Sequences"
Oct 2, 2002  10am-12pm  Science Library, Rm 240  Kyle O'Connor  Affymetrix  A Demonstration of the Affymetrix Data Analysis Software
July 24, 2002  9am-3pm  City of Hope  Accelrys  Accelrys  Accelrys Biosequence Analysis Workshop: DS Gene and SeqWeb (GCG)
July 22, 2002  10am-4pm  Sproul Hall, Rm 2225  InforMax  InforMax  Vector NTI Training: Vector NTI database structure & organization, molecule reports, maps, features, primer design, back translation, vector design & construction, import, export, BioPlot, AlignX, Contig Express.
May 29, 2002 & May 22, 2002  9am-12pm  Sproul Hall, Rm 2225  Thomas Girke  Center for Plant Cell Biology, UCR  GCG Workshop. Manual for this workshop. Covers the basics of how to access GCG from SeqWeb, SeqLab and the command line. Since SeqLab is the most powerful and user-friendly GCG environment, the workshop will focus on this application and give an overview of the various sequence analysis tools available in GCG. This will include high-throughput sequence analysis such as BLAST searches in batch mode and how to make personal BLASTable sequence databases.
April 9, 2002  10am-12pm  Surge Bldg, Rm 284  Michael Gribskov  San Diego Supercomputer Center  Genomic analysis of plant protein kinases