
Assignment Sheet 1 Drug Design 2 Assignments


Prof. Dr. Oliver Kohlbacher, Dr. Jens Krüger<br />

Charlotta Schärfe<br />

Fachbereich Informatik, Angewandte Bioinformatik (ABI)<br />

Zentrum für Bioinformatik Tübingen<br />

Assignments<br />

Assignment Sheet 1<br />

for<br />

Drug Design 2<br />

Hand in electronically by 2 May 2013<br />

The dataset for this assignment (file A1.txt) contains experimental and predicted (theoretical) values of log P, the logarithm of the partition coefficient between octanol and water (see also http://en.wikipedia.org/wiki/Partition_coefficient), for various drug-like molecules. The compounds themselves are available online from the ZINC database (zinc.docking.org) and can be accessed via their IDs (column ZINC ID in the table).<br />

We will use this dataset to explore some of KNIME's capabilities for processing and analyzing simple datasets.<br />
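For orientation, log P is the base-10 logarithm of the ratio of a compound's concentration in octanol to its concentration in water at equilibrium. The following minimal Python sketch only illustrates that definition; the function name and the example concentrations are made up for illustration and are not taken from A1.txt.

```python
import math


def log_p(conc_octanol: float, conc_water: float) -> float:
    """Return log10 of the octanol/water concentration ratio.

    Both concentrations must be positive and given in the same unit.
    """
    if conc_octanol <= 0 or conc_water <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(conc_octanol / conc_water)


# Hypothetical example: 100 times more compound in octanol than in water.
example = log_p(conc_octanol=100.0, conc_water=1.0)
```

A positive value means the compound prefers the octanol phase (more lipophilic), a negative value means it prefers water.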

A 0: KNIME Installation<br />

1. Visit the KNIME website: www.knime.org and http://tech.knime.org/getting-started<br />

2. Watch the screencasts provided at http://tech.knime.org/screencasts<br />

3. Install KNIME on your system (help available at http://tech.knime.org/installation-0 and http://tech.knime.org/files/KNIME_quickstart.pdf)<br />

4. Open the application and briefly describe the use of the 7 different views you see in the window (about 7 sentences)<br />

5. Create your own workspace and call it DrugDesign2<br />

6. Install the community nodes CDK and RDKit (described here: http://tech.knime.org/community)<br />

7. Create a new workflow group named “Assignment Sheet 1”<br />

A 1: First KNIME workflow [2 points]<br />

1. Create a new workflow in your newly created workflow group named “A1”<br />

2. Find a suitable I/O node that can read the file A1.txt found at http://kohlbacherlab.org/Teaching/SS13/DD2/Assignments/assignment-sheet-1/<br />

3. Now build workflow A1 so that it reads the file A1.txt, then configure and run it<br />

4. Briefly explain the data structure used by KNIME nodes for internal data storage (2-3 sentences)<br />

A 2: Data visualization and data table manipulation in KNIME [8 points]<br />

Copy workflow A1 to a new workflow A2 and extend it to do the following:<br />

1. Rename column “abcd” to “logP”<br />

2. Remove/filter out all rows that don’t contain a numeric value in columns 2 or 3<br />

3. Calculate median and mean as well as the standard deviation for columns 2 and 3<br />

4. Plot the dataset
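As an optional cross-check outside KNIME, steps 1 to 3 can be sketched in plain Python. Everything about the input here is an assumption: the sample text is made up, the real A1.txt may use a different delimiter, and only the column names “ZINC ID” and “abcd” come from this sheet (the name “predicted” and the position of “abcd” are hypothetical). The sketch reads step 2 as "drop a row if either column 2 or column 3 is non-numeric", and it leaves out the plot.

```python
import csv
import io
import math
import statistics

# Made-up stand-in for A1.txt; delimiter and third column name are assumptions.
SAMPLE = (
    "ZINC ID\tabcd\tpredicted\n"
    "EXAMPLE_1\t1.0\t1.5\n"
    "EXAMPLE_2\tn/a\t2.0\n"
    "EXAMPLE_3\t3.0\t2.5\n"
    "EXAMPLE_4\t2.0\t\n"
    "EXAMPLE_5\t5.0\t4.0\n"
)


def to_float(text):
    """Return text as a finite float, or None if it is not numeric."""
    try:
        value = float(text)
    except (TypeError, ValueError):
        return None
    return value if math.isfinite(value) else None


def load_clean_rows(text):
    """Parse the table, rename 'abcd' to 'logP', keep fully numeric rows."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    rows = []
    for raw in reader:
        log_p = to_float(raw["abcd"])
        predicted = to_float(raw["predicted"])
        if log_p is None or predicted is None:
            continue  # non-numeric value in column 2 or 3: drop the row
        rows.append(
            {"ZINC ID": raw["ZINC ID"], "logP": log_p, "predicted": predicted}
        )
    return rows


def summarize(values):
    """Mean, median, and sample standard deviation (n - 1 denominator)."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }


rows = load_clean_rows(SAMPLE)
summary = {
    name: summarize([row[name] for row in rows])
    for name in ("logP", "predicted")
}
```

The standard deviation here uses the sample convention (n - 1); it is worth checking which convention the KNIME statistics node reports before comparing numbers.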


A 3: Column comparisons [6 points]<br />

Copy workflow A2 to a new workflow A3 and extend it to do the following:<br />

1. Briefly describe what the logP is used for (2-3 sentences)<br />

2. Compare the experimental and computed logP values for each row and append a column to the table that holds the difference between columns 2 and 3<br />

3. Look up the structures with the highest and the lowest difference between experimental and computed logP in the ZINC database (zinc.docking.org). What do they look like and what are their names?<br />
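The logic of steps 2 and 3 can be sketched in plain Python as a sanity check on the KNIME result. The rows, IDs, and the key names “experimental” and “computed” below are invented for illustration. The sketch uses the signed difference (experimental minus computed); if the absolute difference is intended instead, wrap it in `abs()`.

```python
# Made-up rows; real IDs and values come from the cleaned KNIME table.
rows = [
    {"ZINC ID": "EXAMPLE_1", "experimental": 1.0, "computed": 1.5},
    {"ZINC ID": "EXAMPLE_3", "experimental": 3.0, "computed": 2.5},
    {"ZINC ID": "EXAMPLE_5", "experimental": 5.0, "computed": 4.0},
]


def add_difference(table):
    """Return copies of the rows with a signed 'difference' column appended."""
    return [
        dict(row, difference=row["experimental"] - row["computed"])
        for row in table
    ]


with_diff = add_difference(rows)

# Rows with the highest and the lowest (signed) difference.
highest = max(with_diff, key=lambda row: row["difference"])
lowest = min(with_diff, key=lambda row: row["difference"])
```

The “ZINC ID” values of `highest` and `lowest` are then the ones to look up in the ZINC database.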

A 4: Data export in KNIME [4 points]<br />

Copy workflow A2 to a new workflow A4 and extend it to do the following:<br />

1. Have the workflow export a PDF report of the earlier calculations (e.g., mean, median, plot)<br />

2. Export the final data table to a CSV file with the corresponding node. Save the file as A3.csv<br />

3. Export your workflows A1, A2 and A3 to an archive<br />

Please submit via e-mail the workflow archive, a PDF with your answers to A 0.4, A 1.4, A 3.1, and A 3.3, the PDF report, and A3.csv, bundled as AssignmentSheet1_.tar.gz<br />
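One possible way to build the .tar.gz bundle is Python's standard `tarfile` module. This is a minimal sketch, not a required procedure: apart from A3.csv, the file names below are placeholders, the demonstration uses empty files in a temporary directory, and the archive name is copied from this sheet as printed.

```python
import tarfile
import tempfile
from pathlib import Path


def bundle(files, archive_path):
    """Write the given files into a gzip-compressed tar archive (flat layout)."""
    with tarfile.open(str(archive_path), "w:gz") as archive:
        for path in files:
            # arcname drops the directory part so the archive stays flat.
            archive.add(str(path), arcname=Path(path).name)
    return archive_path


def list_members(archive_path):
    """Return the sorted member names of a gzip-compressed tar archive."""
    with tarfile.open(str(archive_path), "r:gz") as archive:
        return sorted(archive.getnames())


# Demonstration with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    placeholder_names = ["workflows.zip", "answers.pdf", "report.pdf", "A3.csv"]
    paths = []
    for name in placeholder_names:
        path = Path(tmp) / name
        path.write_bytes(b"")
        paths.append(path)
    archive_file = bundle(paths, Path(tmp) / "AssignmentSheet1_.tar.gz")
    members = list_members(archive_file)
```

Listing the members after writing is a cheap check that all four files actually ended up in the archive before sending it.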

If you have questions about the tasks or the lecture, don’t hesitate to come by or email us!<br />

E-mail for questions and assignment submission (please include the assignment number and your name in the subject line):<br />

dd2-ss13@informatik.uni-tuebingen.de<br />
