dataWebViewOrange19.csv (52.79 MB)

Detecting Degradation of Web Browsing Quality of Experience

posted on 02.11.2020, 13:27 by Alexis Huet, Zied Ben Houidi, Bertrand Mathieu, Dario Rossi
This dataset contains 222k samples of web browsing session measurements collected over 2.5 months using the Web View platform (https://webview.orange.com) [1]. Web View lets probes automatically execute multiple web sessions in a real end-user environment. Our test campaign used 17 machines spread across three locations worldwide (Lannion, Paris and Mauritius), with different ISPs and access technologies (ADSL, WiFi and fiber) for a total of 9 combinations, and up to 12 browser versions, covering various versions of Chrome and Firefox. Each machine can request a different browser viewport, enable or disable the AdBlock plugin to emulate different user preferences, and request a specific network protocol (HTTP/1, HTTP/2 or QUIC).

In [2], we use this dataset to frame QoE degradation detection as a change point detection problem. Beyond showing feasibility, our results warn against relying exclusively on QoE indicators that are very close to content, since changes in the content space can trigger false alarms that are not tied to network-related problems.
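To make the change point framing concrete, here is a minimal sketch of a single change point search on a time-ordered series of a QoE indicator such as plt. It is a generic least-squares mean-shift split, not the method evaluated in [2]; the function name, the `min_segment` parameter and the sample values are all illustrative assumptions.

```python
def best_single_change_point(values, min_segment=2):
    """Return (index, cost) for the split that minimises the summed squared
    deviation of each segment from its own mean.

    Generic illustration only, NOT the detector used in [2].
    `index` is the position of the first sample of the second segment.
    """
    n = len(values)
    if n < 2 * min_segment:
        raise ValueError("series too short for the requested segment size")

    def sse(segment):
        # Sum of squared errors around the segment mean.
        mean = sum(segment) / len(segment)
        return sum((v - mean) ** 2 for v in segment)

    best_index, best_cost = None, float("inf")
    for k in range(min_segment, n - min_segment + 1):
        cost = sse(values[:k]) + sse(values[k:])
        if cost < best_cost:
            best_index, best_cost = k, cost
    return best_index, best_cost


# Synthetic page load times (ms), made up for illustration:
# the level shifts upward from the sixth sample onward.
plt_series = [1000, 1020, 990, 1010, 1005, 1500, 1520, 1490, 1510]
change_index, change_cost = best_single_change_point(plt_series)
```

This sketch always returns some split, even for a series with no shift, so a usable detector would also need a decision threshold on the cost reduction. As noted above, a shift found this way in a content-sensitive indicator may reflect a change in the page rather than in the network.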

If you use this dataset in your research, please cite the appropriate papers:

[1] A. Saverimoutou, B. Mathieu, and S. Vaton, “Web View: A measurement platform for depicting web browsing performance and delivery,” IEEE Communications Magazine, vol. 58, no. 3, pp. 33–39, 2020.
[2] A. Huet, Z. Ben Houidi, B. Mathieu, and D. Rossi, “Detecting degradation of web browsing quality of experience,” 16th International Conference on Network and Service Management (CNSM), 2020.

Each row represents one experiment, and the columns are as follows:
- wwwName: Target page
- timestamp: Timestamp with format YYYY-MM-DD hh:mm:ss
- browserUsed: Internet browser and version
- requestedProtocol: Requested L7 protocol
- adBlocker: Whether adBlocker is used or not
- networkIface: Network interface
- winSize: Window size
- visiblePortion: Portion of the page visible above the fold, in percent
- h1Share: Share of the traffic coming from HTTP/1, in percent
- h2Share: Share of the traffic coming from HTTP/2, in percent
- hqShare: Share of the traffic coming from QUIC, in percent
- pushShare: Share of the traffic coming from HTTP/2 Server Push, in percent
- nbRes: Number of objects of the page
- nbResNA: Number of objects coming from North America
- nbResSA: Number of objects coming from South America
- nbResEU: Number of objects coming from Europe
- nbResAS: Number of objects coming from Asia
- nbResAF: Number of objects coming from Africa
- nbResOC: Number of objects coming from Oceania
- nbResUKN: Number of objects coming from unknown provenance
- nbHTTPS: Number of objects coming from an HTTPS connection
- nbHTTP: Number of objects coming from an HTTP connection
- nbDomNA: Number of different domain names coming from North America
- nbDomSA: Number of different domain names coming from South America
- nbDomEU: Number of different domain names coming from Europe
- nbDomAS: Number of different domain names coming from Asia
- nbDomAF: Number of different domain names coming from Africa
- nbDomOC: Number of different domain names coming from Oceania
- firstPaint: First paint time (ms)
- tfvr: Time for Full Visual Rendering (ms)
- dom: DOM time (ms)
- plt: Page Load Time (ms)
- machine: Machine name (containing location information)
- categoryType: Category of the web page
- pageSize: Total web page size (bytes)
- receiveTime: Total receive time from HAR (ms)
- transferRate: Transfer rate (bps)
- id: Unique identification of the current experiment
- config: Identification for the tuple (browserUsed, requestedProtocol, adBlocker, networkIface, winSize, machine, wwwName), i.e. the probe configuration with target wwwName
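As an illustration of how these columns might be consumed, the sketch below parses rows with Python's standard csv module and computes the median plt per config. It assumes the file is comma-separated with a header row matching the column names above, and that missing values appear as empty strings or "NA"; neither is confirmed by this description. The helper names and the sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime
from statistics import median

TIMING_COLUMNS = ("firstPaint", "tfvr", "dom", "plt")  # all in ms


def load_rows(handle):
    """Parse rows from an open CSV handle, typing a few columns.

    Assumes a comma delimiter and a header row (not confirmed above).
    """
    rows = []
    for raw in csv.DictReader(handle):
        row = dict(raw)
        # Timestamp format as documented: YYYY-MM-DD hh:mm:ss
        row["timestamp"] = datetime.strptime(raw["timestamp"], "%Y-%m-%d %H:%M:%S")
        for name in TIMING_COLUMNS:
            # Treating "" and "NA" as missing is an assumption.
            row[name] = None if raw[name] in ("", "NA") else float(raw[name])
        rows.append(row)
    return rows


def median_plt_by_config(rows):
    """Group page load times by probe configuration and take the median."""
    groups = defaultdict(list)
    for row in rows:
        if row["plt"] is not None:
            groups[row["config"]].append(row["plt"])
    return {config: median(values) for config, values in groups.items()}


# Synthetic rows with a subset of the columns, made up for illustration.
SAMPLE = """wwwName,timestamp,firstPaint,tfvr,dom,plt,config
example.org,2019-03-01 10:00:00,120,900,700,1500,c1
example.org,2019-03-01 11:00:00,130,950,720,1700,c1
example.org,2019-03-01 12:00:00,125,NA,710,NA,c1
example.net,2019-03-01 10:05:00,200,1200,900,2000,c2
"""

rows = load_rows(io.StringIO(SAMPLE))
medians = median_plt_by_config(rows)
```

For the real file, the same functions could be applied with `open("dataWebViewOrange19.csv", newline="")` in place of the in-memory sample, provided the format assumptions hold.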
