Learning Data Mining with Python

Chapter 6

The code is as follows:

actual_labels = []
label_mapping = dict(tweet_ids)
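Calling dict() on a sequence of (tweet_id, label) pairs builds a lookup from each ID to its label. A minimal sketch with made-up IDs (the real values come from the labeled dataset):

```python
# Hypothetical (tweet_id, label) pairs for illustration only.
tweet_ids = [(1234567890, 1), (1234567891, 0), (1234567892, 1)]

# dict() over an iterable of 2-tuples maps each first element to the second.
label_mapping = dict(tweet_ids)

print(label_mapping[1234567891])  # prints 0
```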

Next, we are going to connect to Twitter to collect all of these tweets. This is going to take a little longer. Import the twitter library that we used before, create an authorization token, and use it to create the twitter object:

import twitter
consumer_key = ""
consumer_secret = ""
access_token = ""
access_token_secret = ""
authorization = twitter.OAuth(access_token, access_token_secret,
                              consumer_key, consumer_secret)
t = twitter.Twitter(auth=authorization)

Next, extract the twitter IDs (dropping the labels) into a list using the following command:

all_ids = [tweet_id for tweet_id, label in tweet_ids]

Then, we open our output file in append mode to save the tweets:

with open(tweets_filename, 'a') as output_file:

The Twitter API allows us to get up to 100 tweets at a time. Therefore, we iterate over each batch of 100 tweets:

for start_index in range(0, len(tweet_ids), 100):
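range(0, len(tweet_ids), 100) yields the start index of each 100-item batch, and slicing never overruns even when the final batch is short. A quick sketch of the same pattern, using a batch size of 3 for illustration:

```python
ids = list(range(7))  # seven fake IDs

# Step through the list three at a time; the last slice is simply shorter.
batches = [ids[start:start + 3] for start in range(0, len(ids), 3)]

print(batches)  # prints [[0, 1, 2], [3, 4, 5], [6]]
```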

To search by ID, we first create a string that joins all of the IDs (in this batch) together:

id_string = ",".join(str(i) for i in all_ids[start_index:start_index+100])
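The str() conversion matters because join() only accepts strings, while the IDs are integers; the result is a single comma-separated string. For example:

```python
all_ids = [1234567890, 1234567891, 1234567892]  # hypothetical IDs

# Convert each integer ID to a string, then join the first batch with commas.
id_string = ",".join(str(i) for i in all_ids[0:100])

print(id_string)  # prints 1234567890,1234567891,1234567892
```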

Next, we perform a statuses/lookup API call, which is defined by Twitter. We pass our list of IDs (which we turned into a string) into the API call in order to have those tweets returned to us:

search_results = t.statuses.lookup(_id=id_string)
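Putting the pieces together, the collection loop looks roughly like this. A stub stands in for the real t.statuses.lookup call so the structure can be shown without credentials, and the json module with one dump per tweet is an assumption about how the file is written:

```python
import json

def lookup_stub(_id):
    # Stand-in for t.statuses.lookup: returns one fake tweet per requested ID.
    return [{"id": int(i), "text": "placeholder"} for i in _id.split(",")]

tweet_ids = [(100, 1), (101, 0), (102, 1)]  # hypothetical (id, label) pairs
all_ids = [tweet_id for tweet_id, label in tweet_ids]

collected = []
for start_index in range(0, len(all_ids), 100):
    # Comma-separated string of this batch's IDs.
    id_string = ",".join(str(i) for i in all_ids[start_index:start_index + 100])
    # In the real script: search_results = t.statuses.lookup(_id=id_string)
    search_results = lookup_stub(_id=id_string)
    for tweet in search_results:
        collected.append(json.dumps(tweet))  # one JSON object per tweet

print(len(collected))  # prints 3
```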

