specless.dataset
Data and Dataset classes
Data Class
A Data object is essentially a table of rows and named columns. You can query its size:
>>> from specless.typing import Data
>>> demonstration = Data([['a', 1], ['b', 4], ['c', 6]], columns=['symbol', 'timestamp'])
>>> size = demonstration.size # Returns the number of elements in this object
If the object is a TimedTraceData, it holds both trace and timestamp data, accessible by column name:
>>> symbols = demonstration["symbol"] # or demonstration.symbol
... # Returns a Series object
>>> timestamps = demonstration["timestamp"] # or demonstration.timestamp
... # Returns a Series object
You can also convert it into a list of lists:
>>> demonstration.values.tolist() # Returns a list of lists
[['a', 1], ['b', 4], ['c', 6]]
You can sort the data, either returning a sorted copy or sorting in place:
>>> sorted_demonstration = demonstration.sort_values(by="timestamp")
>>> demonstration.sort_values(by=["timestamp", "symbol"], inplace=True)
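To make the sort order concrete, here is a plain-Python sketch of what sorting by timestamp and then by symbol yields. It uses only built-in lists and is not specless code; the unsorted `rows` are made up for illustration, and it assumes `sort_values` orders by the listed columns in turn, as the pandas method of the same name does.

```python
# Hypothetical unsorted rows in [symbol, timestamp] form (not taken from specless).
rows = [["c", 6], ["b", 4], ["a", 1], ["d", 4]]

# Sort by timestamp first, then by symbol to break ties.
# sorted() returns a new list and leaves `rows` untouched,
# mirroring the "sorted copy" form of the call above.
sorted_rows = sorted(rows, key=lambda row: (row[1], row[0]))

print(sorted_rows)  # [['a', 1], ['b', 4], ['d', 4], ['c', 6]]
```

The two rows with timestamp 4 show why the second key matters: without it, their relative order would simply follow the input order.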
Dataset Class
A Dataset object gives access to an individual Data object (a demonstration or trace) by index:
>>> import specless as sl
>>> demonstrations = [
... [["e1",1], ["e2",2], ["e3",3], ["e4",4], ["e5",5]], # trace 1
... [["e1",1], ["e4",3], ["e2",5], ["e3",7], ["e5",9]], # trace 2
... [["e1",2], ["e2",4], ["e4",6], ["e3",8], ["e5",10]], # trace 3
... ]
>>> demonstrations = sl.ArrayDataset(demonstrations, columns=["symbol", "timestamp"])
>>> demonstration = demonstrations[0]
You can also convert the dataset into a nested list, optionally restricted to a single column:
>>> demonstrations.tolist()
[[['e1', 1], ['e2', 2], ['e3', 3], ['e4', 4], ['e5', 5]], [['e1', 1], ['e4', 3], ['e2', 5], ['e3', 7], ['e5', 9]], [['e1', 2], ['e2', 4], ['e4', 6], ['e3', 8], ['e5', 10]]]
>>> demonstrations.tolist(key="symbol")
[['e1', 'e2', 'e3', 'e4', 'e5'], ['e1', 'e4', 'e2', 'e3', 'e5'], ['e1', 'e2', 'e4', 'e3', 'e5']]
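The effect of the `key` argument can be reproduced with plain nested lists. The sketch below is not the specless implementation; `column_per_trace` is a hypothetical helper that only illustrates the shape of the result, one list of column values per trace.

```python
# The same three traces as above, as plain nested lists.
traces = [
    [["e1", 1], ["e2", 2], ["e3", 3], ["e4", 4], ["e5", 5]],
    [["e1", 1], ["e4", 3], ["e2", 5], ["e3", 7], ["e5", 9]],
    [["e1", 2], ["e2", 4], ["e4", 6], ["e3", 8], ["e5", 10]],
]
columns = ["symbol", "timestamp"]


def column_per_trace(traces, columns, key):
    """Hypothetical helper: pick one named column out of every trace."""
    index = columns.index(key)  # position of the requested column
    return [[row[index] for row in trace] for trace in traces]


symbols = column_per_trace(traces, columns, "symbol")
timestamps = column_per_trace(traces, columns, "timestamp")

print(symbols[1])     # ['e1', 'e4', 'e2', 'e3', 'e5']
print(timestamps[1])  # [1, 3, 5, 7, 9]
```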
You can sort every Data object in the dataset in one batch:
>>> f = lambda data: data.sort_values(by=["timestamp", "symbol"], inplace=True)
>>> demonstrations.apply(f)
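Because the function above sorts with `inplace=True`, it returns nothing useful and is called for its side effect on each trace. A rough plain-Python analogue of that pattern is sketched below; `apply_to_each` is a hypothetical stand-in, and whether the real `apply` returns anything is not stated here.

```python
def apply_to_each(traces, func):
    """Hypothetical stand-in for a batch apply: call func on every trace."""
    for trace in traces:
        func(trace)  # func is expected to modify the trace in place


# Made-up traces in [symbol, timestamp] form, deliberately out of order.
traces = [
    [["e2", 5], ["e1", 1]],
    [["e4", 6], ["e2", 4], ["e1", 2]],
]

# list.sort() sorts in place and returns None, like sort_values(..., inplace=True).
apply_to_each(traces, lambda trace: trace.sort(key=lambda row: (row[1], row[0])))

print(traces)  # [[['e1', 1], ['e2', 5]], [['e1', 2], ['e2', 4], ['e4', 6]]]
```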
Classes
Dataset class that contains a list of data.
Base Dataset Class.
Reads a list of CSV files and turns them into a dataset.
Dataset class that contains a path to a file.
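As a rough picture of what reading a list of CSV files into a dataset involves, the sketch below uses only the Python standard library. It is not the specless class: `read_traces` is a hypothetical helper, and the file layout (a header row with `symbol` and `timestamp` columns, integer timestamps) is an assumption made for this example.

```python
import csv
import tempfile
from pathlib import Path


def read_traces(paths):
    """Hypothetical helper: read each CSV file into a list of [symbol, timestamp] rows."""
    traces = []
    for path in paths:
        with open(path, newline="") as f:
            reader = csv.DictReader(f)  # uses the header row for column names
            traces.append([[row["symbol"], int(row["timestamp"])] for row in reader])
    return traces


# Write two small example files (assumed layout) and read them back.
contents = ["symbol,timestamp\ne1,1\ne2,2\n", "symbol,timestamp\ne1,2\ne2,4\n"]
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for i, text in enumerate(contents):
        path = Path(tmp) / f"trace{i}.csv"
        path.write_text(text)
        paths.append(path)
    traces = read_traces(paths)

print(traces)  # [[['e1', 1], ['e2', 2]], [['e1', 2], ['e2', 4]]]
```

Each file becomes one trace, so the result has the same nested-list shape as the traces passed to `sl.ArrayDataset` earlier on this page.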