Copious Free Time: A Project Bunny In My Head
I present a data table, showing the first ten Presidents of the United States.
Order | Presidency | President | Party |
---|---|---|---|
1 | April 30, 1789 – March 4, 1797 | George Washington | Unaffiliated |
2 | March 4, 1797 – March 4, 1801 | John Adams | Federalist |
3 | March 4, 1801 – March 4, 1809 | Thomas Jefferson | Democratic-Republican |
4 | March 4, 1809 – March 4, 1817 | James Madison | Democratic- Republican |
5 | March 4, 1817 – March 4, 1825 | James Monroe | Democratic-Republican |
6 | March 4, 1825 – March 4, 1829 | John Quincy Adams | Democratic- Republican |
7 | March 4, 1829 – March 4, 1837 | Andrew Jackson | Democratic |
8 | March 4, 1837 – March 4, 1841 | Martin Van Buren | Democratic |
9 | March 4, 1841 – April 4, 1841 | William Henry Harrison | Whig |
10 | April 4, 1841 – March 4, 1845 | John Tyler | Whig |
Every row is related because it’s all about the same person. Every column is related because it’s all the same kind of data. If we were to add the other 35, and add ages, ages at inauguration and other data we might want to play with, and we have something we can hang statistics off of.
A data frame, we might say.
When I play with R, this is one of the things that hangs me up. I want to extract the terms count
array so I can tell what percentage of presidents had two full terms or something, and I don’t know how to address it. That’s an issue between me and R, though. But this is conceptually a powerful thing, because we know what the columns are, while with the closest analog in Perl, the multidimensional array, we just know the index. It’s just $presidents[$i][3]
not $presidents[$i][party]
.
But that’s R stuff. I usually blog Perl stuff. What’s going on here?
Point 1: FFI::Platypus
FFI::Platypus is a library for creating interfaces to machine code libraries. I have little desire to do these things myself, but on occasion, I see a thing that makes me think about playing with such things. Like Travis Gibson connecting to libusb
to write a device driver in Perl. (I actually forget if he said he used FFI::Platypus or XS to interact with the library.)
There’s usually a time, mid-summer, when I get the idea of doing something fairly big and special, pulling in FFI or the like. I wonder why that could be?
Point 2: Feather
“Feather is a fast, lightweight, and easy-to-use binary file format for storing data frames.” It works for Python as well. I figure, just because I’m not sure what we’d handle this sort of data structure in Perl, but it certainly we should be able to open the things. I mean, how are we supposed to be able to glue everything if we can’t?
Aside: Arrow
it’s been a while since I looked at this project, and it seems to have been pulled into Apache’s Arrow platform. That just changes where we might go to get the library, not how we’d use it, I think.
Thesis: It Should Be Doable To Introduce Feather to Perl
The two parts, an object file and the FFI code to interact with it, could be done. It’s a Small Matter of Programming, one might say. Or maybe a Learning Experience that could be done in my Copious Free Time.
Yes, the previous paragraph drips with sarcasm, but Perl handling data frames seems reasonable, and FFI is the kind of thing that could level up my game.
Counterpoint: R is not my problem
Nothing I touch these days has R on it. When I touched R, I did it shallowly, anyways. When I wanted to transfer data to R from Perl previously, MySQL was the intervening tool.
So this is not remotely my itch. It is however, an idea that has hopped into my head, and I think trying it is the only way I can get it out.
But maybe, just maybe, the people who might actually use this idea would now see this through, so I don’t have to.