Introduction
Although the functions in fplib are more or less abstract, their development originated in the need to solve real problems. Specifically, it was guided by problems in computational proteomics, particularly the detection of peptides using mass spectrometry. However, as examples from this field would be too domain-specific, the functionality is illustrated here on more general use cases.
Example
Let us start by showing some examples of function composition.
One of the functions commonly used within fplib is dictf(), which simply transforms a dictionary into a function.

Imagine having defined a few colors and a color encoding dictionary.

>>> colors = ['red', 'green', 'blue']
>>> encd = {'red': 0, 'green': 1, 'blue': 2}

We can encode the colors like this:

>>> encf = dictf(encd)  # create the encoding function
>>> encs = list(map(encf, colors))  # map it over the colors
>>> encs
[0, 1, 2]

Now suppose that the structure of the colors becomes more complex.

>>> lcolors = [['red', 'green'], ['green', 'blue']]

If we want to encode the colors and preserve the structure, there are multiple approaches we can take. We will show three of them:
1. Transforming the function into a function over lists
One option is to transform a function f over an element into a function g over a list of elements (g applies f to each element). For this purpose, we have another function, mapf().

We can make it work over lists as follows:

>>> lencf = mapf(encf)  # transform into function over lists
>>> lencf(colors)  # map over the list of colors
[0, 1, 2]

If we want it to also work over lists of lists, we can do the following:

>>> llencf = mapf(lencf)  # transform again
>>> llencf(lcolors)
[[0, 1], [1, 2]]

Although such a structure might look unlikely to occur, similar situations arise quite often when dealing with pandas.DataFrames.

Suppose the frame contains lists of colors in a column.

>>> import pandas as pd
>>> df = pd.DataFrame({'colors': lcolors})
>>> df
          colors
0   [red, green]
1  [green, blue]

We could simply encode them as follows:

>>> df['encs'] = df['colors'].apply(mapf(dictf(encd)))
>>> df
          colors    encs
0   [red, green]  [0, 1]
1  [green, blue]  [1, 2]

Therefore, if we often apply functions, it is useful to be able to combine them in a compact manner. Otherwise we would need to do something like the following:

>>> df['encs'] = df['colors'].apply(lambda l: [encd[e] for e in l])

Although both versions look reasonable, the latter needed two purely auxiliary names (l and e). For such a basic computation this might seem like too much, especially because what we want to express is rather straightforward.
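
For readers without fplib at hand, the idea behind this first approach can be sketched in plain Python. The helpers dictf_sketch and mapf_sketch below are hypothetical stand-ins written only for this illustration; they are not fplib's actual implementation, which may differ in signature and behavior.

```python
def dictf_sketch(d):
    """Turn a dictionary into a lookup function (hypothetical stand-in for dictf)."""
    return lambda key: d[key]


def mapf_sketch(f):
    """Lift f so that it is applied to every element of a list
    (hypothetical stand-in for mapf)."""
    return lambda xs: [f(x) for x in xs]


encd = {'red': 0, 'green': 1, 'blue': 2}
lcolors = [['red', 'green'], ['green', 'blue']]

encf = dictf_sketch(encd)
# Lifting twice gives a function over lists of lists.
llencf = mapf_sketch(mapf_sketch(encf))
print(llencf(lcolors))
```

Each lift adds exactly one level of list nesting, so the number of lifts has to match the depth of the data.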
2. Playing with structure
Another approach we could take with fplib is to flatten the structure of the lcolors list, apply the encoding, and put the structure back.

For this, we can remove the outer structure with unlist1().

>>> ucolors = unlist1(lcolors)
>>> ucolors
['red', 'green', 'green', 'blue']

Now we can map directly:

>>> uencs = list(map(encf, ucolors))
>>> uencs
[0, 1, 1, 2]

And finally, we can relist it back to the previous structure.

>>> lencs = relist1(uencs, lcolors)
>>> lencs
[[0, 1], [1, 2]]

Note
This approach is useful if we can perform some calculation much faster over the linear structure, but we need to restructure the results.
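
The flatten, map, and restore steps can be sketched in plain Python as follows. The names unlist1_sketch and relist1_sketch are hypothetical stand-ins for this illustration, assuming that restoring the structure means regrouping the flat values by the sublist lengths of the original; fplib's own unlist1() and relist1() may work differently.

```python
def unlist1_sketch(lists):
    """Remove one level of nesting (hypothetical stand-in for unlist1)."""
    return [x for sub in lists for x in sub]


def relist1_sketch(flat, template):
    """Regroup flat values so the result has the same sublist lengths
    as template (hypothetical stand-in for relist1)."""
    flat = list(flat)
    out, start = [], 0
    for sub in template:
        out.append(flat[start:start + len(sub)])
        start += len(sub)
    return out


encd = {'red': 0, 'green': 1, 'blue': 2}
lcolors = [['red', 'green'], ['green', 'blue']]

ucolors = unlist1_sketch(lcolors)         # flatten
uencs = [encd[c] for c in ucolors]        # work on the flat list
lencs = relist1_sketch(uencs, lcolors)    # restore the original grouping
print(lencs)
```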
3. Deep mapping
Yet another approach would be to map deeply over the list structure using deepmap().

This approach, however, has different semantics and is shown here only as an example.

>>> deepmap(encf, lcolors)
[[0, 1], [1, 2]]

Summary
The functions in fplib can make code more compact and, for readers comfortable with function composition, more readable, since they avoid purely auxiliary names such as l and e.
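
To illustrate how deep mapping can differ from lifting a function a fixed number of times, here is a minimal sketch. It assumes that deep mapping means recursing into nested lists to any depth and applying the function to the leaves; the source does not spell out these semantics, and deepmap_sketch is a hypothetical stand-in, not fplib's deepmap().

```python
def deepmap_sketch(f, x):
    """Apply f to every non-list leaf, recursing into nested lists
    (hypothetical stand-in; fplib's deepmap may behave differently)."""
    if isinstance(x, list):
        return [deepmap_sketch(f, e) for e in x]
    return f(x)


encd = {'red': 0, 'green': 1, 'blue': 2}
encf = encd.__getitem__  # plain dictionary lookup as the encoding function

lcolors = [['red', 'green'], ['green', 'blue']]
ragged = ['red', ['green', ['blue']]]  # nesting depth varies per element

uniform_result = deepmap_sketch(encf, lcolors)
ragged_result = deepmap_sketch(encf, ragged)
print(uniform_result)
print(ragged_result)
```

Under this assumption, the recursive version handles structures of uneven depth, whereas a function lifted a fixed number of times expects every element to sit at the same nesting level.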