Pickling and Shelves - Department of Computer Science
Pickling and Shelves - Department of Computer Science
Pickling and Shelves - Department of Computer Science
You also want an ePaper? Increase the reach of your titles
YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.
<strong>Pickling</strong> <strong>and</strong> <strong>Shelves</strong><br />
Dan Fleck<br />
Coming up: What is pickling?
What is pickling?<br />
<strong>Pickling</strong> - is the process <strong>of</strong> preserving food by<br />
anaerobic fermentation in brine (a solution <strong>of</strong> salt in<br />
water), to produce lactic acid, or marinating <strong>and</strong><br />
storing it in an acid solution, usually vinegar (acetic<br />
acid). The resulting food is called a pickle. This<br />
procedure gives the food a salty or sour taste.<br />
– Wikipeadia<br />
Do we really need to know that?<br />
No… but pickling is done to preserve food for later eating.<br />
<strong>Pickling</strong> in Python is done to preserve data for later usage.<br />
(And without the salty or sour taste! ! )
<strong>Pickling</strong> in Python<br />
•!The pickle module implements an algorithm for<br />
serializing <strong>and</strong> de-serializing a Python object<br />
structure.<br />
•!‘<strong>Pickling</strong>’ is the process whereby a Python object<br />
hierarchy is converted into a byte stream<br />
•!‘Unpickling’ is the inverse operation, whereby a byte<br />
stream is converted back into an object hierarchy.<br />
•!Alternatively known as serialization or marshalling, to<br />
avoid confusion, the terms used here are pickling <strong>and</strong><br />
unpickling. – Python Docs
Simply Put<br />
<strong>Pickling</strong> is a way to store <strong>and</strong> retrieve data<br />
variables into <strong>and</strong> out from files.<br />
Data variables can be lists, ints, classes,<br />
etc… There are some rules, but mostly<br />
these all work.
How does it work Pickle-Master?<br />
To pickle something you must:<br />
import pickle<br />
Then to write a variable to a file:<br />
pickle.dump(myString, outfile)<br />
Always remember, Pickle-chips, that the outfile needs to<br />
be opened first!<br />
This will store your data into a file<br />
It works with lists <strong>and</strong> objects also!!
Unpickling<br />
To un-pickle something (read the file back into a variable)<br />
you must:<br />
import pickle<br />
Then to write a variable to a file:<br />
myString = pickle.load(inputfile)<br />
Always remember, Pickle-chips, that the inputfile needs to<br />
be opened first!<br />
This will recreate the myString data from the file!
Example<br />
import pickle<br />
def pickler(name1, name2):<br />
output = open('pickle_jar', 'w')<br />
pickle.dump(name1, output)<br />
pickle.dump(name2, output)<br />
output.close()<br />
def unpickler():<br />
input = open('pickle_jar', 'r')<br />
name1 = pickle.load(input)<br />
name2 = pickle.load(input)<br />
input.close()<br />
return name1, name2<br />
def main():<br />
nm1 = 'Vlassic'<br />
nm2 = 'Heinz'<br />
pickler(nm1, nm2)<br />
nm1, nm2 = unpickler()<br />
print 'NM1 is %s, NM2 is %s' %(nm1,<br />
nm2)<br />
main()<br />
>>> Output: NM1 is Vlassic, NM2 is<br />
Heinz<br />
Notice: You can add multiple variables into a<br />
pickle file, but you must read <strong>and</strong> write them in<br />
the exact same order!
•! See example<br />
Lets try it!<br />
•! Can we read in the variables from the<br />
list_pickler.py file?
Example with a list<br />
import pickle<br />
def pickler(name1, lst):<br />
output = open('pickle_jar', 'w')<br />
pickle.dump(name1, output)<br />
pickle.dump(lst, output)<br />
output.close()<br />
def unpickler():<br />
input = open('pickle_jar', 'r')<br />
name1 = pickle.load(input)<br />
myList = pickle.load(input)<br />
input.close()<br />
return name1, myList<br />
def main():<br />
nm1 = 'Vlassic’<br />
types = ['sweet', 'hot', 'dill', 99]<br />
pickler(nm1, types)<br />
nm1, types = unpickler()<br />
print 'NM1 is %s, types is %s' %(nm1,<br />
types)<br />
main()<br />
>>> NM1 is Vlassic, types is ['sweet',<br />
'hot', 'dill', 99]
<strong>Pickling</strong> is great… but…<br />
•! Two problems with pickling:<br />
–!Pickle files are not backward compatible<br />
•! The format <strong>of</strong> the pickle file is not defined by you, so if<br />
you change your data (for example, you want to store an<br />
additional variable…. you must create new pickle file, all<br />
existing pickle files will be broken<br />
–!Pickle files must be accessed in sequential order<br />
•! You cannot say “give me the 3 rd variable stored in the file”<br />
•! The solution to this problem is ‘shelves’. I can store many<br />
pickle jars on a shelf! !
<strong>Shelves</strong><br />
•! <strong>Shelves</strong> let you store <strong>and</strong> retreive variables<br />
in any order using pickling.<br />
•! <strong>Shelves</strong> need a name to get the<br />
data. So, it works like a dictionary…<br />
given a name, lookup the data<br />
In the pickle<br />
dictionary,<br />
under the name<br />
“shape” store<br />
this info!<br />
pickles = shelve.open(‘pickle_jar’, writeback=True)<br />
pickles[“variety”] = [“sweet”, “hot”, “dill”]<br />
pickles[“shape”] = [“whole”,”spear”,”chip”]<br />
pickles.sync() # make sure data is stored to the file<br />
pickles.close()
<strong>Shelves</strong> Example<br />
import pickle<br />
import shelve<br />
Cool!<br />
pickles = shelve.open('pickle_jar.dat’, writeback=True)<br />
pickles['variety'] = ['sweet', 'hot', 'dill']<br />
pickles['shape'] = ['whole','spear','chip']<br />
pickles.sync()<br />
pickles.close()<br />
pickles = shelve.open('pickle_jar.dat')<br />
shapes = pickles['shape']<br />
pickles.close()<br />
print 'SHAPES:', shapes<br />
In the pickle<br />
dictionary,<br />
lookup the data<br />
in under<br />
“shape”<br />
>>> SHAPES: ['whole', 'spear', 'chip']
What is pickling really doing?<br />
•! The real power in the pickling is converting<br />
data into a formatted string automatically.<br />
•! This is what you did manaully converting<br />
things into:<br />
–!john~smith~G001212~12~124<br />
•! <strong>Pickling</strong> just st<strong>and</strong>ardizes this for Python
Why?<br />
•! The most common use for a string<br />
representation <strong>of</strong> a data structure is to<br />
store <strong>and</strong> retrieve it in files<br />
•! However you can also convert it to a<br />
string <strong>and</strong> store it in a database or send<br />
it across a network<br />
•! All <strong>of</strong> these help support persistance
Persistence<br />
•! In computer science, persistence refers to the<br />
characteristic <strong>of</strong> data that outlives the execution<br />
<strong>of</strong> the program that created it. Without this<br />
capability, data only exists in memory, <strong>and</strong> will<br />
be lost when the memory loses power, such as<br />
on computer shutdown. – Wikipedia<br />
•! So, pickling supports persistence by allowing<br />
you to easily write data to a persistent data<br />
store (e.g. a file, or database)
Why is Python so cool?<br />
•! Because it uses fun words like pickling.<br />
•! Other languages do the same thing as<br />
pickling but call it<br />
–!object serialization <strong>and</strong> deserialization<br />
–!marshalling <strong>and</strong> unmarshalling
Summary<br />
•! <strong>Pickling</strong> <strong>and</strong> unpickling let you store/retreive<br />
data (lists, int, string, objects, etc..) from files.<br />
This is helps you persist data.<br />
•! This is called serialization or marshaling in<br />
other languages (Java for example)<br />
•! <strong>Shelves</strong> let you store <strong>and</strong> retrive pickled data<br />
in any order (they allow r<strong>and</strong>om access – any<br />
data can be read/written at any time)