wiki:BrooklineVote

Version 16 (modified by Eric Hazen, 6 years ago) (diff)

--

Bulk E-mailing

Did this on my home machine and it seems to work:

Spreadsheet Editing

Various ways of reading Excel spreadsheets in Perl. Best one seems to be Spreadsheet::Read. This seems to work well on .xls (open and Save As in Excel 2010) files but not so well on the supplied .xlsx.

HOWEVER the utility xlscat supplied with Spreadsheet::Read stops after 214 rows.

Data and tools SVN: http://edf.bu.edu/svn/edf/EricHome/BrooklineVote

Database Work

2018-03-25

Prototype working as r3070 in SVN "online" folder.

Change requests from Sara:

  • Add new column "Comment" (text)
  • Add new column "ID" (text)
  • Add new column "EMail" (text)
  • Add existing column "ocupation" (sic)
  • Add new column "ElectedOfficial" (text) with name "ELO" or something
  • Change HouseNo column title in table to "No" [done]
  • Rationalize form layout [done]
  • Sort by up to 4 columns [done]
  • Display query row count
  • Date of Birth displayed as age in table [done]
  • Create an optimized print view
  • Change column display order [done]:

ID, Precinct, No, Street, Apt, First, Last, Phone, EMail, Age, Sex, Occup., 15, 16, 17, Comment

2018-03-10

Wrote script import_csv.pl to import CSV to SQLite db. Seems to work but dates are not handled correctly. Now all data is in voters.db. Dump to sql using sqlite3 CLI .dump. Wrote perl script to re-format dates (fix_dates.pl) and create fixed DB voters_new.db).

Can't figure out how to extract year from date in SQLite except with:

substr(date(birthdate),1,4)).

This code will make a summary of ages by 5-year groups:

SELECT count(ROWID),                                                                                                                                
  (2018 - cast( substr( date( birthdate),1,4) as int))/5*5 as age                                                                                   
  from frequent                                                                                                                                     
  group by age                                                                                                                                      
  order by age;                                                                                                                                     

For frequent voters, this produces:

10,15
227,20
251,25
342,30
569,35
1000,40
1346,45
1314,50
1249,55
1342,60
1453,65
1486,70
1027,75
616,80
386,85
244,90
66,95
15,100

Thinking about apartment buildings. What we want is a query to generate a list of all unique addresses with more than n distinct apartment numbers. This seems promising:

select streetno, streetname, count(distinct aptno) 
from voters 
where aptno != '' 
group by streetname, streetno 
order by streetname, streetno;

Here is a version where we use a sub-select to filter the results to > 10 apartments per address:

SELECT sub.*                                                                                                                                             
  FROM(                                                                                                                                                  
    SELECT streetno, streetname, count(distinct aptno) AS napt                                                                                           
    FROM voters                                                                                                                                          
    WHERE aptno != ''                                                                                                                                    
    GROUP BY streetname, streetno                                                                                                                        
    ORDER BY napt desc                                                                                                                                   
  ) sub                                                                                                                                                  
  WHERE sub.napt > 10;

This is not very accurate because not all apartments have registered voters in them. In fact, it would be useful to know what fraction of each large multi-unit actually have frequent voters. For this I think we need to pull in the property database.