User:Badmachine/wikimedia-research-2014-08-26
[14:33:44] <halfak> Hey Ironholds
[14:33:49] <halfak> uber analytics meeting!
[15:26:26] <Ironholds> R protip of the day: magrittr makes your code 20 times more readable.
[15:32:22] <DarTar> hey halfak, let me grab some coffe, brb
[15:32:26] <milimetric> Ironholds / DarTar: tnegrin suggested that we sync up a bit about the mobile dashboard stuff
[15:32:28] <halfak> Hokay
[15:32:34] <Ironholds> milimetric, okie-dokes!
[15:32:36] * halfak runs to get a candy bar
[15:32:48] <Ironholds> just lemme know when so I can ty and get on a decent connection
[15:32:57] * Ironholds may go over to the backup hacker collective for that
[15:33:13] <milimetric> Ironholds: after lunch east coast? So like 13:00 EST?
[15:33:30] <milimetric> lemme look at your calendars
[15:33:47] <Ironholds> oh yeah we're in the same timezone now!
[15:33:49] <Ironholds> hi from the north!
[15:34:30] <milimetric> :) time looked clear so I sent the invite
[15:34:38] <milimetric> feel free to reject - I'm gonna go get some food
[15:35:08] <Ironholds> cool!
[15:35:10] <Ironholds> ditto
[16:03:50] <DarTar> hey leila
[16:04:01] <DarTar> for some reason I can’t respond via DM
[16:04:09] <DarTar> IRC tells me you’re not online
[16:04:23] <leila> mmm
[16:04:27] <leila> can you see here?
[16:04:27] <DarTar> anyway, if you read me, feel free to respond to that thread
[16:04:29] <DarTar> yes
[16:04:39] <leila> okay. I'll respond. thanks!
[16:05:01] <leila> (and I'll try to join for StandUp, if the connection cooperates. It's lunch time here, so I may be free.)
[16:09:31] <DarTar> leila: sounds good
[16:28:19] <halfak> DarTar: I've got 5 minutes if you do.
[16:32:21] <leila> DarTar, halfak, are you in Hangout?
[16:32:30] <halfak> Will be in a minute.
[16:32:33] <leila> got it
[16:32:37] <DarTar> coming
[16:33:59] <DarTar> Ironholds: yt?
[16:34:03] <DarTar> standup
[17:37:48] <Ironholds> darnit
[17:37:58] <Ironholds> I have an R problem in a class Leila is fricking great at and she's away :(
[17:58:14] <J-Mo> ping yuvipanda
[17:58:42] * yuvipanda pings J-Mo
[17:59:09] <J-Mo> :) how close are you to implementing CSV download in Quarry? I was hoping to use that functionality in my webinar tomorrow, if it was available.
[17:59:18] <yuvipanda> 'sup
[17:59:50] <J-Mo> if not, I can use Wikimetrics instead.
[18:00:34] <yuvipanda> J-Mo: oh, can do by tomorrow, sure
[18:00:36] <J-Mo> basically, I want to guide users through the process of grabbing a CSV dataset from the slaveDB, and manipulating those data with Python, + adding in some related data from the API
[18:00:38] <yuvipanda> J-Mo: let me hook it up
[18:00:50] <yuvipanda> should be done in a couple of hours
[18:01:01] <milimetric> yuvipanda: let me know if you get stuck on anything with CSV downloads
[18:01:04] <J-Mo> that okay? I don't want to throw an emergency deadline at you!
[18:01:06] <milimetric> I can help
[18:01:24] * J-Mo thanks milimetric and yuvipanda
[18:01:25] <yuvipanda> J-Mo: nah, it's trivial.
[18:01:35] <yuvipanda> milimetric: \o/ will poke if needed
[18:02:06] <yuvipanda> J-Mo: there's a recurring bug I can't track down yet tho - if a query is 'queued' for more than 10s, please ask people to hit 'submit' again
[18:02:10] <J-Mo> cool cool! ping me if you need my help (for some reason), or if you have questions, etc
[18:02:21] <J-Mo> will do, Yuvi
[18:03:03] <yuvipanda> J-Mo: cool :) also note that with python's default csv module, it barfs on Unicode CSVs, and you need to install the unicodecsv module
[18:03:15] <yuvipanda> that'll trip people up if they're not using englishwiki for stats
[18:05:01] <J-Mo> good catch. I'll work with that module (I'm making test scripts for people to manipulate, rather than having them write Python from scratch).
[18:05:23] <yuvipanda> cool
[18:44:17] <DarTar> Ironholds, yt?
[18:44:43] <Ironholds> DarTar, yep!
[18:45:01] <DarTar> so I had a thought about the referral stuff while I was in the shower
[18:45:07] <DarTar> (be very afraid)
[18:46:20] <DarTar> in the context of referred traffic we’ve been talking about PVs and deprioritized UC/UV because the sampled data is not the best source to answer that question
[18:46:31] <Ironholds> DarTar, yup
[18:46:33] <DarTar> but I thought, what about unique articles?
[18:46:38] * Ironholds thinks
[18:47:02] <Ironholds> that would be programmatically problematic to implement as part of the same dataset. But I could do it alongside.
[18:47:04] <DarTar> I think it would be fascinating to determine the breadth of traffic we get from referred vs organic traffic
[18:47:21] <Ironholds> I could even build it out as an inequality coefficient or something
[18:47:22] <DarTar> yeah, so it’s not as high a priority as simple PV counts
[18:47:35] <Ironholds> "here is the coefficient for referred traffic over time, here is the coefficient for organic"
[18:47:42] <Ironholds> I can make it into a nice little animated visualisation
[18:47:47] <DarTar> yeah, so I was imagining that probably some segments of traffic by referral are really focused on a small subset of articles
[18:47:49] <Ironholds> I've wanted an excuse to play with ggvis for a while.
[18:47:56] <DarTar> oh god, I take that back
[18:47:59] <DarTar> :p
[18:48:07] <Ironholds> what?
[18:48:14] * DarTar kidding
[18:48:15] <Ironholds> y u no liek animated gifs?
[18:48:37] <DarTar> I just read in the changelog for the latest Dropbox iOS app:
[18:48:51] <DarTar> “better support for high resolution animated GIFs”
[18:48:56] <DarTar> WTF
[18:49:38] <DarTar> anyway, what do you think about unique articles as a secondary metric to look into on a longitudinal basis? I think we would discover a lot of interesting things
[18:50:11] <Ironholds> sure. Do you want the specific articles, or the number, or the inequality?
[18:50:21] <Ironholds> like I said, I could have it save a list of coordinates for coefficients.
[18:50:54] <DarTar> brb
[18:59:06] <DarTar> oh sorry, by longitudinal I meant “historical”, not by geography
[18:59:40] <DarTar> (unless I misunderstand what you mean by coordinates)
[19:00:11] <DarTar> hey, I have to find Abbey for lunch, bbl
[19:15:03] <Ironholds> DarTar, nono, I know ;p
[19:15:10] <Ironholds> I meant coordinates in the sense of coordinate plotting
[19:15:25] <Ironholds> i.e., I give you a list, each element of which is a set of coordinates R can interpret as forming a geni coefficient in plot()
[22:34:54] <DarTar> yuvipanda: you should really use my “fleur de yuvi” screenshot as your Trello avatar
[22:35:02] <yuvipanda> DarTar: :D
[22:35:05] <yuvipanda> DarTar: I could
[22:35:29] <DarTar> I can’t talk to people who show up as YP on a grey background on Trello
[22:35:32] <yuvipanda> DarTar: I hung out with qchris today, and he was surprised to see me in the same outfit ;)
[22:35:33] <yuvipanda> hehe
[22:36:58] <DarTar> :D
[22:36:59] <DarTar> it’s the famous scottish draight of 2014
[22:36:59] <DarTar> draught
[22:37:00] <DarTar> aaarg drought
[22:37:00] <DarTar> seriously
[22:46:10] <yuvipanda> DarTar: :D
[22:46:37] <DarTar> Tuesday afternoon dyslexia :-/
[22:47:53] <yuvipanda> J-Mo: CSV and TSV download implemented, I just need to add a button now
[22:48:06] <J-Mo> sweet
[22:48:11] <J-Mo> thanks dude
[22:49:58] <yuvipanda> J-Mo: \o/. When's the webinar?
[22:51:21] <J-Mo> 1500 UTC Wednesday. Same day/time as last week.
[22:54:15] <yuvipanda> J-Mo: cool, I should be around
[22:54:51] <J-Mo> double sweet. I'll be directing people to this chan if they have questions, as usual.
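Note on the 18:47 to 19:15 exchange: the "inequality coefficient" and "list of coordinates" that Ironholds describes read like a Gini coefficient computed from Lorenz-curve points. The log does not show how it was actually implemented (and the discussion is about R), so the following is only a minimal Python sketch under that assumption. The function names and the per-article pageview counts are hypothetical.

```python
def lorenz_points(counts):
    """Return Lorenz-curve coordinates for per-article pageview counts.

    Each point is (cumulative share of articles, cumulative share of views),
    with articles ordered from least to most viewed.
    """
    ordered = sorted(counts)
    total = sum(ordered)
    if not ordered or total <= 0:
        raise ValueError("need at least one article with a positive count")
    n = len(ordered)
    points = [(0.0, 0.0)]
    running = 0
    for i, views in enumerate(ordered, start=1):
        running += views
        points.append((i / n, running / total))
    return points


def gini(counts):
    """Gini coefficient: 1 minus twice the area under the Lorenz curve."""
    pts = lorenz_points(counts)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        # Trapezoid between two consecutive Lorenz points.
        area += (x1 - x0) * (y0 + y1) / 2
    return 1 - 2 * area


# Hypothetical pageview counts per article for two traffic segments.
referred = [900, 50, 30, 20]     # concentrated on one article
organic = [300, 250, 250, 200]   # spread fairly evenly

referred_gini = gini(referred)
organic_gini = gini(organic)
```

A value near 0 means views are spread evenly across articles; a value near 1 means they pile onto a few. The points from `lorenz_points` are the kind of coordinate list that could be saved per time period and plotted later.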
[22:56:08] <yuvipanda> J-Mo: cool
[22:56:22] <yuvipanda> now to add buttons
[23:18:52] <yuvipanda> J-Mo: so the download buttons themselves aren't done yet, will happen tomorrow
[23:18:59] <yuvipanda> I've most of the code done but am being dragged to sleep
[23:20:00] <yuvipanda> J-Mo: csv output does work tho http://quarry.wmflabs.org/run/988/output/0/csv (or http://quarry.wmflabs.org/run/988/output/0/tsv since Ironholds hates CSV)
[23:20:02] <J-Mo> no problem, YuviPanda. Send me a quick email if for some reason you can't finish them
[23:20:06] <yuvipanda> I'll just need to add buttons
[23:20:08] <yuvipanda> J-Mo: will do
[23:20:08] <J-Mo> oh@ good.
[23:20:11] <yuvipanda> thanks!
[23:20:18] <J-Mo> nice to have a backup. Goodnight!
[23:20:25] <yuvipanda> J-Mo: you can also download JSON, btw http://quarry.wmflabs.org/run/988/output/0/json
[23:21:03] <yuvipanda> anyway, am off! cya
[23:25:10] <DarTar> Ironholds: I was reviewing https://meta.wikimedia.org/wiki/Research:Mobile_trends and I noticed that you added a note on filtering bots for edits, but there’s no mention of filtering crawlers from the traffic dataset
[23:25:34] <DarTar> I know you’ve done extensive filtering of bots, I think we should have this in the Data section of that report
[23:25:49] <Ironholds> DarTar, yes! shall do.
[23:25:56] <DarTar> danke
[23:27:07] <Ironholds> DarTar, done!
[23:27:23] <DarTar> sweet thanks
[23:29:43] <Ironholds> DarTar, also, JFYI, I don't know how around I'll be tomorrow
[23:29:58] <Ironholds> on account of I'm staying up all night to monitor the referer scripts and work on the apps stuff
[23:30:13] <Ironholds> because I live on a couch and don't exactly have anything to do until the 31st
[23:30:15] <Ironholds> ;p
[23:30:41] <DarTar> sounds good to me
[23:31:11] <DarTar> make sure you let da boss know
[23:35:59] <YuviPanda> J-Mo: yay, got a few mins back.
[23:36:04] <YuviPanda> J-Mo: and download buttons in place! http://quarry.wmflabs.org/query/354
[23:36:39] <Ironholds> DarTar, totes
[23:36:44] <Ironholds> okay: noms!
[23:36:51] <Ironholds> I'm being taken to something called tasty burger(?)
[23:37:39] <DarTar> this being Boston, I wouldn’t get worried
[23:37:48] <DarTar> YuviPanda: nice stuff
[23:38:30] <J-Mo> YuviPanda: hmm, I'm not seeing the download link?
[23:38:53] <J-Mo> where on the page is it 'sposed to show up?
[23:38:54] <YuviPanda> J-Mo: right above the table to the right. do a hard refresh (ctrl+f5 or cmd+shift+r)
[23:39:31] <J-Mo> there it is!
[23:39:57] <J-Mo> THAT IS JUST BEAUTIFUL
[23:40:02] <YuviPanda> J-Mo: :D
[23:40:05] * J-Mo weeps from happiness
[23:40:19] <DarTar> future feature request: add the full query to a title element of each row in the Recent Queries table so you can quickly preview it on hovering
[23:40:21] <J-Mo> thanks for coming through in the clinch once again
[23:40:28] <YuviPanda> J-Mo: \o/
[23:40:41] <J-Mo> DarTar: https://trello.com/b/fdwhYLns/quarry
[23:40:50] <J-Mo> (for your feature requests)
[23:40:58] <DarTar> kewl
[23:42:29] <DarTar> done: https://trello.com/c/HyZSUsmU
[23:42:54] <YuviPanda> J-Mo: I'll add a feature tomorrow that always downloads the latest successful run results, and then CORS headers, so people can preview things
[23:42:58] <YuviPanda> err
[23:43:04] <YuviPanda> people can write scripts and shit
[23:43:06] <YuviPanda> I meant :)
[23:43:10] <YuviPanda> I'm off now!
[23:43:16] <DarTar> good night
[23:44:02] <J-Mo> night!
[23:55:05] <YuviPanda|zzz> J-Mo: oh well, looks like I've some more time :) anything else you want?
[23:55:19] <J-Mo> nope. I'm good! thanks, though
[23:55:21] <YuviPanda|zzz> J-Mo: also if you're teaching people python, I highly reccomend you ask them to use JSON than CSV/TSV.
[23:55:22] <J-Mo> GO TO SLEEP!!!
[23:55:32] <YuviPanda|zzz> It's only midnight!
[23:55:40] <J-Mo> hehe. touche.
[23:55:43] <YuviPanda|zzz> J-Mo: JSON also has no unicode problems and doesn't need an extra library
[23:55:48] <YuviPanda|zzz> oh, it's actuall 1 AM
[23:56:04] <YuviPanda|zzz> well, my girlfriend got sucked into a wordpress loop, so I've time until she realizes it's way past 5mins
[23:56:10] <J-Mo> re: JSON: I'm going to teach people what JSON is, but will also have thme manipulating CSVs, since a lot of people are more comfortable exploring data in spreadsheets
[23:58:48] <YuviPanda|zzz> J-Mo: ah, cool. don't forget unicodecsv then, since the error message it gives otherwise is super confusing - "'ascii' codec can not decode ordinal '0xfe' at position 7' or something like that
[23:59:08] <J-Mo> yeah, I'm intimately familiar with that cryptic message :)
[23:59:31] <YuviPanda|zzz> J-Mo: :D
[23:59:54] * YuviPanda|zzz channels inner halfak
[23:59:58] <YuviPanda|zzz> WE SHOULD ALL USE PYTHON3
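Note on the CSV vs JSON thread: the unicodecsv workaround applies to Python 2, where the standard csv module worked on byte strings. In Python 3 the standard csv module reads text, so decoding the download as UTF-8 first is enough. The sketch below parses the same small result set from CSV and from JSON using only the standard library. The sample data and column names are made up, and the JSON layout (a "headers" list plus a "rows" list) is an assumption for illustration, not a documented Quarry format.

```python
import csv
import io
import json

# Hypothetical downloads, held in memory as UTF-8 bytes (no network access).
CSV_BYTES = "page_title,views\nZürich,120\n東京,340\n".encode("utf-8")
JSON_BYTES = json.dumps(
    {"headers": ["page_title", "views"], "rows": [["Zürich", 120], ["東京", 340]]}
).encode("utf-8")


def rows_from_csv(raw):
    """Decode UTF-8 bytes and parse them with the standard csv module."""
    text = io.StringIO(raw.decode("utf-8"), newline="")
    return list(csv.DictReader(text))


def rows_from_json(raw):
    """Parse the assumed {"headers": [...], "rows": [...]} layout."""
    payload = json.loads(raw.decode("utf-8"))
    return [dict(zip(payload["headers"], row)) for row in payload["rows"]]


csv_rows = rows_from_csv(CSV_BYTES)
json_rows = rows_from_json(JSON_BYTES)

# CSV values always arrive as strings; JSON keeps numbers as numbers.
csv_views = [int(row["views"]) for row in csv_rows]
json_views = [row["views"] for row in json_rows]
```

One practical difference for teaching: every CSV field comes back as a string and has to be converted by hand, while JSON preserves the number type.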