| Line 19: |
Line 19: |
| | | | |
| | :Only 92 senators listed? Sigh... This is a disappointing discovery. We had about five volunteers taking this project under their collective wing. Looks like we goofed. As a quick check, I'm looking at the Wikipedia edit history on Alaska senator [http://en.wikipedia.org/w/index.php?title=Lisa_Murkowski&diff=157265253&oldid=154626199 Lisa Murkowski], and it appears that there were no edits of the vandalism or falsehood variety. Data point of one, but I suspect the other 7 are also free of vandalism. Maybe, John, you would like to update us with a mini-spreadsheet of the missing 8 senators? Big apology for us missing/misreporting that data. -- [[User:MyWikiBiz|MyWikiBiz]] 10:57, 29 May 2009 (PDT) | | :Only 92 senators listed? Sigh... This is a disappointing discovery. We had about five volunteers taking this project under their collective wing. Looks like we goofed. As a quick check, I'm looking at the Wikipedia edit history on Alaska senator [http://en.wikipedia.org/w/index.php?title=Lisa_Murkowski&diff=157265253&oldid=154626199 Lisa Murkowski], and it appears that there were no edits of the vandalism or falsehood variety. Data point of one, but I suspect the other 7 are also free of vandalism. Maybe, John, you would like to update us with a mini-spreadsheet of the missing 8 senators? Big apology for us missing/misreporting that data. -- [[User:MyWikiBiz|MyWikiBiz]] 10:57, 29 May 2009 (PDT) |
| | + | |
| | + | ::Besides Alaska's [http://en.wikipedia.org/w/index.php?title=Lisa_Murkowski&diff=157265253&oldid=154626199 Lisa Murkowski], the other seven missing senators are listed below. In all seven cases, judging by the edit summaries for the three months, the articles appear to have had no vandalism during the period (I didn't check whether the articles as of 1 October 2007 contained ongoing vandalism):
| | + | ::*Maryland, [http://en.wikipedia.org/w/index.php?title=Ben_Cardin&diff=178837292&oldid=168200237 Ben Cardin] |
| | + | ::*Montana, [http://en.wikipedia.org/w/index.php?title=Jon_Tester&diff=260108186&oldid=242187084 Jon Tester] |
| | + | ::*Nebraska, [http://en.wikipedia.org/w/index.php?title=Ben_Nelson&diff=258051393&oldid=243906310 Ben Nelson] |
| | + | ::*Nevada, [http://en.wikipedia.org/w/index.php?title=John_Ensign&diff=180665258&oldid=163610701 John Ensign] |
| | + | ::*New Jersey, [http://en.wikipedia.org/w/index.php?title=Frank_Lautenberg&diff=180619195&oldid=162723305 Frank Lautenberg] |
| | + | ::*New Mexico, [http://en.wikipedia.org/w/index.php?title=Jeff_Bingaman&diff=178836952&oldid=162118619 Jeff Bingaman] |
| | + | ::*North Carolina, [http://en.wikipedia.org/w/index.php?title=Richard_Burr&diff=181142828&oldid=165499016 Richard Burr] |
| | + | :: I'm going to pass on providing a mini-spreadsheet. That would require (among other things) totaling the number of page views during the period, which I don't have time for. -- [[User:John Broughton|John Broughton]] 09:05, 8 July 2009 (PDT) |
| | | | |
| | === Overlapping damage === | | === Overlapping damage === |
| Line 28: |
Line 38: |
| | | | |
| | :Drat! I even remember trying to calculate properly Ted Stevens' "kinky sex adventures" versus the clarification that these adventures took place "with donkeys". I intended to be fair, but it looks like in my copying and pasting between Wikipedia and Google spreadsheet, I didn't "cut off" the plain old "kinky sex adventures" at 23:50, 11 November, as I should have. Your 10% vs. 90% explanation is clear to me, but I hope that you'll understand that because these situations were relatively rare, calculating the way I did, it relieved me of an even more undue burden of trying to re-calculate "error view chances" based on layered vs. unlayered errors. If you have a $10,000 federal grant to support a re-calculation, I'll be happy to "volunteer" again! -- [[User:MyWikiBiz|MyWikiBiz]] 10:57, 29 May 2009 (PDT) | | :Drat! I even remember trying to calculate properly Ted Stevens' "kinky sex adventures" versus the clarification that these adventures took place "with donkeys". I intended to be fair, but it looks like in my copying and pasting between Wikipedia and Google spreadsheet, I didn't "cut off" the plain old "kinky sex adventures" at 23:50, 11 November, as I should have. Your 10% vs. 90% explanation is clear to me, but I hope that you'll understand that because these situations were relatively rare, calculating the way I did, it relieved me of an even more undue burden of trying to re-calculate "error view chances" based on layered vs. unlayered errors. If you have a $10,000 federal grant to support a re-calculation, I'll be happy to "volunteer" again! -- [[User:MyWikiBiz|MyWikiBiz]] 10:57, 29 May 2009 (PDT) |
| | + | :: I don't think it would take a research grant to fix this, just some spreadsheet formulas. Basically, for a given row, if the ending date/time of the vandalism is earlier than the ending date/time in the prior row, then don't count '''any''' of that row's views as being damaged, since the prior row already accounts for that period. (To be even more accurate, you could check the current row against more rows: say, the row that is two lines above, and the row that is three lines above.)
| | + | |
| | + | ::Similarly, you could use a spreadsheet formula to flag, for manual inspection, cases where the starting date/time of a row is earlier than the ending date/time of the row(s) above it; you could even tell the spreadsheet to ignore rows where the views in question are fewer than 10 (or so), as being immaterial. -- [[User:John Broughton|John Broughton]] 09:14, 8 July 2009 (PDT)
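The two spreadsheet rules described above can be sketched in Python. This is only an illustration, not the study's actual spreadsheet: it assumes the rows are kept in the sheet's chronological order, and the names `DamageRow`, `apply_overlap_rules`, `lookback` and `min_views`, as well as the sample dates and view counts, are all made up for the example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class DamageRow:
    start: datetime   # when the damaging edit went live
    end: datetime     # when it was reverted
    views: float      # estimated views while the damage was visible


def apply_overlap_rules(rows, lookback=1, min_views=10):
    """Return (counted_views, needs_review), one entry per row.

    Rule 1: if a row ends earlier than one of the `lookback` rows above it,
            count none of its views as damaged.
    Rule 2: flag a row for manual inspection if it starts earlier than one of
            those rows ends, unless its views are below `min_views`.
    """
    counted, needs_review = [], []
    for i, row in enumerate(rows):
        prior = rows[max(0, i - lookback):i]
        ends_inside_prior = any(row.end < p.end for p in prior)
        counted.append(0.0 if ends_inside_prior else row.views)
        overlaps_prior = any(row.start < p.end for p in prior)
        needs_review.append(overlaps_prior and row.views >= min_views)
    return counted, needs_review


# Hypothetical rows: the second one sits entirely inside the first.
rows = [
    DamageRow(datetime(2007, 11, 11, 20, 0), datetime(2007, 11, 12, 2, 0), 50),
    DamageRow(datetime(2007, 11, 11, 22, 0), datetime(2007, 11, 11, 23, 50), 20),
    DamageRow(datetime(2007, 11, 13, 0, 0), datetime(2007, 11, 13, 1, 0), 30),
]
counted, needs_review = apply_overlap_rules(rows)
print(counted, needs_review)
```

Raising `lookback` to 2 or 3 corresponds to checking the row two or three lines above as well. Rows flagged by the second rule would still need a human to decide how to split the views between the layered errors.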
| | | | |
| | === Meaningless "damaged article-minutes" calculation === | | === Meaningless "damaged article-minutes" calculation === |
| Line 65: |
Line 78: |
| | | | |
| | Lastly, the use of User:Henrik's traffic tool was a bit of a stretch. Assuming that Henrik wrote the traffic-monitoring script correctly, it seemed to be reliable enough. The counts seemed to pass the reality check, too. Barack Obama got far more views than John Barrasso. However, note that another flaw with our calculations was that the study spanned the Fourth Quarter of 2007, but we only took one month of Henrik's traffic data (January 2008) to extrapolate for the previous 90 days. We were sort of forced into this, because Henrik's tool only came into being on December 10, 2007, so the closest "complete" month of traffic data was January 2008. Of course, 2008 was an election year, so that certainly had some skewing effect on candidates up for re-election, versus those who were not running in any race. -- [[User:MyWikiBiz|MyWikiBiz]] 10:57, 29 May 2009 (PDT) | | Lastly, the use of User:Henrik's traffic tool was a bit of a stretch. Assuming that Henrik wrote the traffic-monitoring script correctly, it seemed to be reliable enough. The counts seemed to pass the reality check, too. Barack Obama got far more views than John Barrasso. However, note that another flaw with our calculations was that the study spanned the Fourth Quarter of 2007, but we only took one month of Henrik's traffic data (January 2008) to extrapolate for the previous 90 days. We were sort of forced into this, because Henrik's tool only came into being on December 10, 2007, so the closest "complete" month of traffic data was January 2008. Of course, 2008 was an election year, so that certainly had some skewing effect on candidates up for re-election, versus those who were not running in any race. -- [[User:MyWikiBiz|MyWikiBiz]] 10:57, 29 May 2009 (PDT) |
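To make the extrapolation concrete, here is a rough sketch of how one month of traffic might be turned into an estimate of views during a damaged period. It assumes views were spread evenly over every minute of January 2008 and that the same rate held during the Fourth Quarter of 2007, which is exactly the weak point noted above. The function name and the sample numbers are hypothetical and are not taken from the study's spreadsheet.

```python
MINUTES_IN_JANUARY = 31 * 24 * 60  # 44,640 minutes


def estimate_damaged_views(january_views, damaged_minutes):
    """Estimate views of a damaged revision from one month of traffic.

    Assumes a constant per-minute view rate equal to the January 2008 average.
    """
    views_per_minute = january_views / MINUTES_IN_JANUARY
    return views_per_minute * damaged_minutes


# Hypothetical article with 89,280 views in January (2 per minute on average),
# damaged for 90 minutes.
print(estimate_damaged_views(89_280, 90))
```

Because the rate is a monthly average, it ignores time-of-day swings and any election-year difference between late 2007 and January 2008 traffic.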
| | + | |
| | + | :Ack! I forgot to even mention that the research team had collected an even larger tally of "errors" in the articles about the 100 senators than what was posted on Google Docs; however, before publication of the database, I personally removed a substantial number of "rows" in the database that constituted more ''minor'' typographical or other grammatical edits that simply didn't offend the sensibilities. If I recall, I removed about 30 or 40 such rows. -- [[User:MyWikiBiz|MyWikiBiz]] 11:37, 29 May 2009 (PDT) |