27.06.2013 Views

Assignment Sheet 1 Drug Design 2 Assignments

Prof. Dr. Oliver Kohlbacher, Dr. Jens Krüger
Charlotta Schärfe
Fachbereich Informatik, Angewandte Bioinformatik (ABI)
Zentrum für Bioinformatik Tübingen

Assignments
Assignment Sheet 1
for
Drug Design 2

Hand in electronically by 2 May 2013

The dataset for this assignment (file A1.txt) contains experimental and theoretical (predicted) data on the log of the partition coefficient between octanol and water (log P, see also http://en.wikipedia.org/wiki/Partition_coefficient) for various drug-like molecules. The compounds themselves are available online from the ZINC database (zinc.docking.org) and can be accessed via their IDs (column ZINC ID in the table).

We will use this dataset to explore some of the abilities of KNIME to process and analyze simple datasets.
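As a reminder of the quantity stored in the dataset: log P is the base-10 logarithm of the ratio of a compound's concentration in the octanol phase to its concentration in the water phase at equilibrium. The following is a minimal plain-Python illustration, independent of KNIME; the function name `log_p` and the example concentrations are made up for this sketch.

```python
import math


def log_p(conc_octanol: float, conc_water: float) -> float:
    """Return log10 of the octanol/water concentration ratio.

    Both concentrations must be positive and given in the same units.
    """
    if conc_octanol <= 0 or conc_water <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(conc_octanol / conc_water)


# Hypothetical example: 100 times more compound in octanol than in water.
example = log_p(100.0, 1.0)
print(example)
```

A positive value means the compound prefers the octanol phase, a negative value means it prefers water.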

A 0: KNIME Installation

1. Visit the KNIME website: www.knime.org and http://tech.knime.org/getting-started
2. Watch the screencasts provided at http://tech.knime.org/screencasts
3. Install KNIME on your system (help available at http://tech.knime.org/installation-0 and http://tech.knime.org/files/KNIME_quickstart.pdf)
4. Open the application and briefly describe the use of the 7 different views you see in the window (about 7 sentences)
5. Create your own workspace and call it DrugDesign2
6. Install the community nodes CDK and RDKit (described here: http://tech.knime.org/community)
7. Create a new workflow group named “Assignment Sheet 1”

A 1: First KNIME workflow [2 points]

1. Create a new workflow named “A1” in your newly created workflow group
2. Find a suitable I/O node that can read the file A1.txt found at http://kohlbacherlab.org/Teaching/SS13/DD2/Assignments/assignment-sheet-1/
3. Now build the workflow A1 so that it reads the file A1.txt, then configure and run it
4. Briefly explain the data structure used by KNIME nodes for internal data storage (2-3 sentences)

A 2: Data visualization and data table manipulation in KNIME [8 points]

Copy workflow A1 to a new workflow A2 and extend it to do the following:

1. Rename column “abcd” to “logP”
2. Remove (filter out) all rows that lack a numeric value in column 2 or column 3
3. Calculate the median, mean, and standard deviation for columns 2 and 3
4. Plot the dataset
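To cross-check the numbers your KNIME workflow produces for steps 1 to 3, the same operations can be sketched outside KNIME in plain Python. This is only an illustration under stated assumptions: the tab delimiter, the header name `predicted`, the row IDs, and all values are invented stand-ins, since the actual layout of A1.txt is not given here. The sketch also assumes that a row is dropped when either column 2 or column 3 is non-numeric, and it uses the sample standard deviation.

```python
import csv
import io
import math
import statistics

# Hypothetical stand-in for A1.txt. Delimiter, header names other than
# "abcd", IDs, and values are assumptions made for this sketch.
SAMPLE = (
    "ZINC ID\tabcd\tpredicted\n"
    "ID_A\t1.0\t1.5\n"
    "ID_B\tn/a\t2.0\n"
    "ID_C\t3.0\t2.5\n"
    "ID_D\t2.0\t\n"
    "ID_E\t5.0\t4.0\n"
)


def parse_number(text):
    """Return text as a finite float, or None if it is not numeric."""
    try:
        value = float(text)
    except ValueError:
        return None
    return value if math.isfinite(value) else None


def summarize(values):
    """Return mean, median, and sample standard deviation of values."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }


reader = csv.reader(io.StringIO(SAMPLE), delimiter="\t")
header = next(reader)

# Step 1: rename column "abcd" to "logP".
header = ["logP" if name == "abcd" else name for name in header]

# Step 2: keep only rows with numeric values in both column 2 and column 3.
rows = []
for row in reader:
    col2, col3 = parse_number(row[1]), parse_number(row[2])
    if col2 is not None and col3 is not None:
        rows.append((row[0], col2, col3))

# Step 3: summary statistics for columns 2 and 3, keyed by column name.
stats = {header[i]: summarize([row[i] for row in rows]) for i in (1, 2)}

for name, summary in stats.items():
    print(name, summary)
```

If your KNIME statistics node reports a different standard deviation, check whether it divides by n or by n - 1 before assuming an error in the workflow.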


A 3: Column comparisons [6 points]

Copy workflow A2 to a new workflow A3 and extend it to do the following:

1. Briefly describe what the logP is used for (2-3 sentences)
2. Compare the experimental and computed logP values for each row and append a column to the table that holds the difference between columns 2 and 3
3. Look up the structures with the highest and the lowest difference between experimental and computed logP in the ZINC database (zinc.docking.org). What do they look like and what are their names?
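The column comparison in step 2 and the ranking needed for step 3 can be sketched in plain Python as follows. The rows are hypothetical (invented IDs and values), and the sketch assumes that "highest" and "lowest" refer to the magnitude of the difference; rank by the signed value instead if the task is meant literally.

```python
# Hypothetical cleaned rows: (ID, value in column 2, value in column 3).
rows = [
    ("ID_A", 1.0, 1.5),
    ("ID_C", 3.0, 2.75),
    ("ID_E", 5.0, 4.0),
]


def append_difference(table):
    """Return new rows with (column 2 minus column 3) appended."""
    return [(zid, col2, col3, col2 - col3) for zid, col2, col3 in table]


with_diff = append_difference(rows)

# Rank by the absolute difference (an assumption, see the text above).
largest = max(with_diff, key=lambda r: abs(r[3]))
smallest = min(with_diff, key=lambda r: abs(r[3]))

print("largest difference:", largest[0], largest[3])
print("smallest difference:", smallest[0], smallest[3])
```

The IDs found this way are the ones to look up in the ZINC database.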

A 4: Data export in KNIME [4 points]

Copy workflow A2 to a new workflow A4 and extend it to do the following:

1. Get the workflow to export a pdf-report of the calculations done before (e.g., mean, median, plot)
2. Export the final data table to a CSV file with the corresponding node. Save the file as A3.csv
3. Export your workflows A1, A2 and A3 to an archive
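For orientation, the file that the CSV export in step 2 should produce is a header row followed by one comma-separated line per table row. The sketch below writes such a file with Python's standard `csv` module; the column names, IDs, and values are hypothetical, and the file is written to a temporary directory so that nothing in your workspace is overwritten.

```python
import csv
import os
import tempfile

# Hypothetical final table: header and rows are invented for this sketch.
header = ["ZINC ID", "logP", "predicted", "difference"]
rows = [
    ("ID_A", 1.0, 1.5, -0.5),
    ("ID_E", 5.0, 4.0, 1.0),
]


def export_csv(path, header, rows):
    """Write header and rows to path as comma-separated values."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)


out_dir = tempfile.mkdtemp()
out_path = os.path.join(out_dir, "A3.csv")
export_csv(out_path, header, rows)

# Read the file back to confirm what was written.
with open(out_path, newline="", encoding="utf-8") as handle:
    written = list(csv.reader(handle))
print(written)
```

Opening the exported A3.csv in a text editor and comparing it against this layout is a quick way to confirm that the KNIME writer node used the expected delimiter and wrote the column header.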

Please submit via e-mail the workflow archive, a pdf with your answers to A 0.4, A 1.4, A 3.1, and A 3.3, the pdf-report, and A3.csv, bundled as AssignmentSheet1_.tar.gz

If you have questions about the tasks or the lecture, don’t hesitate to come by or email us!

E-mail for questions and assignment submission (please include the assignment number and your name in the subject line):
dd2-ss13@informatik.uni-tuebingen.de
