Version 16 (modified by 6 years ago) (diff) | ,
---|
Bulk E-mailing
Did this on my home machine and it seems to work:
Spreadsheet Editing
Various ways of reading Excel spreadsheets in Perl. Best one seems to be Spreadsheet::Read. This seems to work well on .xls (open and Save As in Excel 2010) files but not so well on the supplied .xlsx.
HOWEVER the utility xlscat
supplied with Spreadsheet::Read stops after 214 rows.
Data and tools SVN: http://edf.bu.edu/svn/edf/EricHome/BrooklineVote
Database Work
2018-03-25
Prototype working as r3070 in SVN "online" folder.
Change requests from Sara:
- Add new column "Comment" (text)
- Add new column "ID" (text)
- Add new column "EMail" (text)
- Add existing column "ocupation" (sic)
- Add new column "ElectedOfficial" (text) with name "ELO" or something
- Change HouseNo column title in table to "No" [done]
- Rationalize form layout [done]
- Sort by up to 4 columns [done]
- Display query row count
- Date of Birth displayed as age in table [done]
- Create an optimized print view
- Change column display order [done]:
ID, Precinct, No, Street, Apt, First, Last, Phone, EMail, Age, Sex, Occup., 15, 16, 17, Comment
2018-03-10
Wrote script import_csv.pl to import CSV to SQLite db. Seems to work but dates are not handled correctly.
Now all data is in voters.db
. Dump to sql using sqlite3 CLI .dump
. Wrote perl script to re-format dates (fix_dates.pl
) and create fixed DB voters_new.db
).
Can't figure out how to extract year from date in SQLite except with:
substr(date(birthdate),1,4))
.
This code will make a summary of ages by 5-year groups:
SELECT count(ROWID), (2018 - cast( substr( date( birthdate),1,4) as int))/5*5 as age from frequent group by age order by age;
For frequent voters, this produces:
10,15 227,20 251,25 342,30 569,35 1000,40 1346,45 1314,50 1249,55 1342,60 1453,65 1486,70 1027,75 616,80 386,85 244,90 66,95 15,100
Thinking about apartment buildings. What we want is a query to generate a list of all unique addresses with more than n
distinct apartment numbers. This seems promising:
select streetno, streetname, count(distinct aptno) from voters where aptno != '' group by streetname, streetno order by streetname, streetno;
Here is a version where we use a sub-select to filter the results to > 10 apartments per address:
SELECT sub.* FROM( SELECT streetno, streetname, count(distinct aptno) AS napt FROM voters WHERE aptno != '' GROUP BY streetname, streetno ORDER BY napt desc ) sub WHERE sub.napt > 10;
This is not very accurate because not all apartments have registered voters in them. In fact, it would be useful to know what fraction of each large multi-unit actually have frequent voters. For this I think we need to pull in the property database.