h
2014-07-28 14:10:59 UTC
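The annual Wikimedia conference is about to begin, and I have an idea for a hackathon: I would like to organize a cross-national event.
The goal is to have participants collaborate, using existing code and data to analyze click traffic on GLAM images.
I hope everyone can ask around whether anyone in their city is willing to learn and build this.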
---------- Forwarded message ----------
From: h <hanteng-***@public.gmane.org>
Date: 2014-07-28 9:19 GMT+01:00
Subject: Re: [Wiki-research-l] [Wikitech-l] Tech Talk: Hadoop and Beyond.
An overview of Analytics infrastructure, Tuesday!
To: Research into Wikimedia content and communities <
wiki-research-l-RusutVdil2icGmH+5r0DM0B+***@public.gmane.org>
Cc: Wikimedia developers <wikitech-l-RusutVdil2icGmH+5r0DM0B+***@public.gmane.org>, "A mailing list
for the Analytics Team at WMF and everybody who has an interest in
Wikipedia and analytics." <analytics-RusutVdil2icGmH+5r0DM0B+***@public.gmane.org>
Dear wiki-research-l and wiki-tech-l members,
Specifically on Nuria Ruiz and Andrew Otto's talk on July 15th on the
NARA analytics pilot: Commons:GLAMwiki_Toolset_Project/NARA_analytics_pilot
<https://commons.wikimedia.org/wiki/Commons:GLAMwiki_Toolset_Project/NARA_analytics_pilot>,
I wonder whether it is possible to duplicate this for other GLAM
institutions so as to expand its global GLAM outreach.
I plan to compare/contrast two to four GLAM institutions that host
substantial Chinese collections in China, Taiwan, and the U.K. I hope that it
can be turned into a hackathon event to recruit coders and researchers from
the Chinese-speaking regions.
Another benefit of duplicating the NARA analytics pilot is to demonstrate
the possible data-research workflow using the data and infrastructure
provided by the Wikimedia Foundation.
I am not sure whether the following list contains all the tasks involved, or
how much time is needed to finish them (please give me your estimate, thanks):
# Identify images to be logged for visiting traffic.
# Get log data permission (Is it gonna be difficult?)
# Start logging
# Visualize using glam-metrics (http://glam-metrics.wmflabs.org/)
# Localize the stats report with Chinese locale/translation
# Customize glam-metrics so that more than one GLAM institution can be
compared.
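As a rough illustration of the last step in the list above (comparing more than one institution), here is a minimal Python sketch. Everything in it is an assumption made for illustration: the file names, the institution labels, and the (date, file name) log shape are hypothetical, and it does not use glam-metrics or any actual Wikimedia log format.

```python
from collections import Counter, defaultdict

# Hypothetical mapping from Commons file names to the GLAM institution that
# contributed them (the "identify images" step). Names are made up.
IMAGE_TO_INSTITUTION = {
    "File:Example_scroll_A.jpg": "Institution A",
    "File:Example_vase_A.jpg": "Institution A",
    "File:Example_map_B.jpg": "Institution B",
}

# Hypothetical log records, one per image request: (ISO date, file name).
SAMPLE_LOG = [
    ("2014-07-01", "File:Example_scroll_A.jpg"),
    ("2014-07-01", "File:Example_map_B.jpg"),
    ("2014-07-02", "File:Example_scroll_A.jpg"),
    ("2014-07-02", "File:Example_vase_A.jpg"),
    ("2014-07-02", "File:Untracked.jpg"),  # not in the mapping, so ignored
]


def views_per_institution(log, image_to_institution):
    """Count logged requests per institution, skipping untracked files."""
    totals = Counter()
    for _date, file_name in log:
        institution = image_to_institution.get(file_name)
        if institution is not None:
            totals[institution] += 1
    return totals


def daily_views(log, image_to_institution):
    """Return {institution: {date: count}} for side-by-side comparison."""
    table = defaultdict(Counter)
    for date, file_name in log:
        institution = image_to_institution.get(file_name)
        if institution is not None:
            table[institution][date] += 1
    return {name: dict(days) for name, days in table.items()}


if __name__ == "__main__":
    totals = views_per_institution(SAMPLE_LOG, IMAGE_TO_INSTITUTION)
    for name, total in totals.most_common():
        print(f"{name}: {total} requests")
```

Keeping the image-to-institution mapping separate from the counting logic means another institution can be added by extending the mapping alone.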
Please also let me know if I have missed anything. Many thanks.
Best,
--
Wikipedia - the free encyclopedia that anyone can edit
Post via the mailing list or newsgroup (please post in plain text):
zh_wikipedia (AT) googlegroups (DOT) com
news://news.gmane.org/gmane.org.user-groups.wikipedia.chinese
Subscribe to or unsubscribe from the Wikipedia group:
(new) httpS://groups.google.com/forum/#forum/zh_wikipedia
httpS://groups.google.com/group/zh_wikipedia/subscribe
httpS://groups.google.com/group/zh_wikipedia/boxsubscribe
---
You received this message because you are subscribed to the Google Groups "zh.wikipedia" group.
To unsubscribe from this group and stop receiving emails from it, send an email to zh_wikipedia+unsubscribe-/***@public.gmane.org.
To view this discussion on the web, visit https://groups.google.com/d/msgid/zh_wikipedia/CAOnynsaeaxWW54XT2Wx0L52p67UQa55mgCakpeGh%2BYWM7qEXQg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
Thanks for this. Forwarding to Analytics and Research for others who are
curious.
Pine
Wiki-research-l mailing list
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
This Tech Talk will be starting in 30 minutes. Thanks!
Wikitech-l mailing list
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Hello!
Please join Nuria Ruiz and Andrew Otto next Tuesday, July 15th at 10am SF
time/5pm UTC
<http://www.timeanddate.com/worldclock/fixedtime.html?msg=Analytics+Tech+Talk&iso=20140715T10&p1=224&am=30>
for a 30 min tech talk. You can join our hangout or follow along on
https://plus.google.com/u/0/b/103470172168784626509/events/c53ho5esd0luccd09a1c30rlrmg
(please note that a link to join the hangout will be posted in the
comments of this event just as it starts).
You can ask questions on IRC during the talk in #wikimedia-dev.
If you are not able to follow along live, a video recording will be posted
here
<https://plus.google.com/u/0/b/103470172168784626509/103470172168784626509/videos>
and to the MediaWiki YouTube channel immediately following the tech talk for
you to view at any time.
*Hadoop and Beyond. An overview of Analytics infrastructure*
In this tech talk we will be presenting the analytics infrastructure that we
have recently rolled out in production. By now probably everybody knows that
Wikimedia hosts an instance of Hadoop from which we are going to extract
pageview data in the near future. But how exactly does the data get
there?
We will go over the path that webrequest log data takes from Varnish to
Kafka (a distributed log buffer) to Hadoop, and the challenges of
deploying this Java-based infrastructure in production. We will also talk
about how we can query the data with Hive, an SQL-like interface, how you can
set up this stack on Vagrant to play with and, last but not least, how we used
https://commons.wikimedia.org/wiki/Commons:GLAMwiki_Toolset_Project/NARA_analytics_pilot
Thanks!