Skip to main content

Documents Industrial Engineering

Lecture 22.pdf

Lecture 22

pdf

School

McGill University *

*We aren’t endorsed by this school

Course

MISC

Subject

Industrial Engineering

Date

Oct 30, 2023

Type

pdf

Pages

46

Uploaded by AAT1234567

Lecture 22: Multi arm bandits (Continued) + AB Testing D ATA - DRIVEN M ODELS F OR O PERATIONS A NALYTICS 1

A NNOUNCEMENTS • HW6 (Individual) is posted and is due on April 8th (Multi Arm Bandits) • I will count the best 3 individual HWs out of 4 • HW 5 grades will be posted this weekend • Next lecture will be Project Work Day 2

P ROJECT P RESENTATION • Please submit by April 5th your presentation slides (10 min-2min questions) • Discuss the background, the data, the approach and methodology, and results (if any) • We will together select top 3 presentation teams. These groups will get some awards J • You will be graded for submission of the slides. The numerical grade will be for the report. • Example of slides are posted on myCourses 3

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

P ROJECT REPORT • Please submit 4 page-report + appendices (Max 10 pages) + Code, by April 11 th • Examples of reports on myCourses 4

F INAL EXAM • Cumulative, Duration 3h • Covers: Regression, Times Series, Network Analysis, Linear and integer programming, Capacity Allocation and Opportunity costs (Bid price policies), Logical Constraints, Multi Arm bandits • Will post practice questions this weekend • Prepared for you a cheat sheet to help you summarize the course learnings ! 5

O UTLINE • Multi-armed Bandit Application - • A/B Testing at Vungle 6

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

A PPROXIMATE S OLUTIONS • Semi-random strategies – Perform a greedy action by default – Occasionally, take a random action • Probability matching strategies

T HOMPSON S AMPLING • We begin with prior belief about the arms’ reward distributions • Idea: At each trial, we choose an arm randomly , with the choice probability equal to our current belief that the chosen arm is indeed the best arm • Then we observe realized reward and update our belief (by Bayes’ rule)

T HOMPSON S AMPLING • Widely used in digital applications (where sampling is easy) • A good way to incorporate prior knowledge • Need to have relatively reliable priors or “long” enough horizon (could just be minutes in digital apps)

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

A B USINESS E XAMPLE • A store wants to test and learn about the profitability of 5 possible assortments over one year of experimentation – Each week, decide on an assortment to offer – 52 weeks in total • How should they perform the experiment?

S IMPLIFIED A SSUMPTION • Assume each assortment ࠵? ’s revenue follows a normal distribution ࠵? ࠵? ࠵? , ࠵? ࠵? • But we don’t know what the ࠵? ࠵? ’s are • So, we need to learn them as we gather more information

O BSERVATIONS • Epsilon-greedy tends to do better eventually with smaller ࠵? (rule of thumb: ࠵? = ࠵?. ࠵? ) • With no prior information, epsilon-decreasing does well with larger ࠵? to be able to explore initially • Thompson sampling performs well if you have priors and can sample from beliefs – and performs better as your estimation of priors improves

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

M ULTI - ARMED B ANDIT W RAP -U P • Exploration vs. Exploitation • Valuable approach if you need to learn while you earn: “online/real-time learning” • Very broad applications – Next week: mobile ads, recommendation systems • Epsilon-greedy/decreasing, Thompson sampling

C ASE : A/B T ESTING AT V UNGLE

C ASE B ACKGROUND • How does Vungle make money? – Video-ad platform that embeds ads in apps and games • Who are the key players in the mobile video ad market? – Users, publishers, advertisers, platform (Vungle) • What challenges does Vungle face? – Mobile in-app advertising funnel (98% fill rate, 88% completion rate, 5% click-through rate, 0.5% conversion rate); advertisers pay upon installs – How to improve conversion rate?

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

I S A LGORITHM A OR B B ETTER ? • Hypothesis testing – Null hypothesis: eRPM of A = eRPM of B – Alternative hypothesis: eRPM of B > eRPM of A • Paired t-test (paired by day) – Mean(eRPM of B – eRPM of A) = 0.11 (this is called an "effect", how much is the right effect?) – Standard error = 0.034 – p = 0.003 • eRPM of B is significantly higher than eRPM of A

H YPOTHESIS T ESTING Distribution under null hypothesis ࠵? = ࠵? ࠵? #࠵? p value p value : the probability of observing data at least as extreme as the current ones, assuming the null is true

H YPOTHESIS T ESTING Distribution under null hypothesis ࠵? = ࠵? ࠵? #࠵? p value p value : the probability of observing data at least as extreme as the current ones, assuming the null is true REJECT at 5% level ( p < 0.05)

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

S TATISTICAL P OWER OF THE T EST • Power = Prob (reject null | alternative is true) – Ability of a test to detect an effect if it exists – The higher the better – Given a significance level (e.g., 5%) and an effect size (smaller effect size requires more data) , need more data to achieve higher power • Power analysis: What is the minimum sample size to detect an effect?

P OWER A NALYSIS E X .: T - TEST • ࠵? ࠵? : eRPM of A on day ࠵? • ࠵? ࠵? : eRPM of B on day ࠵? • Null: . ࠵? − . ࠵? = ࠵? ; Alternative: . ࠵? − . ࠵? > ࠵? • Desired significance level = 5% • Effect size ࠵? : . ࠵? − . ࠵? = ࠵?. ࠵?࠵? • With 30 days of experiments, what is the power of our test? Prob(reject . ࠵? − . ࠵? = ࠵? | . ࠵? − . ࠵? = ࠵?. ࠵?࠵? )

• t statistic with ࠵? days of experiments ࠵? ࠵? = & ࠵? ࠵? ⁄ #࠵? ࠵? = ࠵? ࠵? ∑ ࠵?$࠵? ࠵? ࠵? ࠵? − ࠵? ࠵? ⁄ #࠵? ࠵? ࠵? : sample size; #࠵? : sample standard deviation • To reject null hypothesis at 5% level, need ࠵? ࠵? > ࠵?. ࠵?࠵? (use normal approx.) • Power = Prob ࠵? ࠵? > ࠵?. ࠵?࠵?|࠵? = ࠵?. ࠵?࠵? = Prob & ࠵? ࠵? − ࠵? ⁄ #࠵? ࠵? > ࠵?. ࠵?࠵? − ࠵? ⁄ #࠵? ࠵? |࠵? = ࠵?. ࠵?࠵? ≈ ࠵? − ࠵? ࠵?. ࠵?࠵? − ࠵?. ࠵?࠵? ⁄ #࠵? ࠵? ≈ ࠵?. ࠵?࠵? P OWER A NALYSIS E X .: T - TEST (TECHNICAL SLIDE)

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

P OWER A NALYSIS E X .: T - TEST (TECHNICAL SLIDE) • What is the min # of days of experiments to achieve a power of at least 0.95? • Power = ࠵? − ࠵? ࠵?. ࠵?࠵? − ࠵?.࠵?࠵? ⁄ ’࠵? ࠵? > ࠵?. ࠵?࠵? ⇒ ࠵? ≥ ࠵?࠵? • R function: – power.t.test(n = , d = , sig.level = , power = , type = c("two.sample", "one.sample", "paired")) – n is the sample size, d is the effect size (theta) , and type indicates a two-sample, one-sample, or paired t- test – Input three of the four (n, d, sig.level, power) and get the fourth as output

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

P OWER A NALYSIS E X .: T - TEST Power = ࠵? − ࠵? ࠵? − ࠵? ⁄ B࠵? ࠵? • How does power relate to sample size? – A larger sample size à higher power • How does power relate to desired significance level? – A stronger sig. level à lower power (harder to reject) à need more data • How does power relate to effect size? – A larger effect size à higher power (easier to detect) à need less data

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

C HALLENGE OF A/B T ESTING • If the effect size is small… • If we need to test many different variations… • Need many experiments – May not be able to distinguish “good” from “best” • Likely sacrifice revenue/profit! – Experiment on inferior options à do not incorporate information feedback • Solution? – Learn while you earn! à Multi-armed bandit algorithms

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

A/B T ESTING VS . M ULTI - ARMED B ANDIT • A/B testing is essentially first explore then exploit • Epsilon-first strategy in multi-armed bandit – Pure exploration (the first ࠵?࠵? rounds) followed by pure exploitation (the remaining ࠵? − ࠵? ࠵? rounds) – Arms are uniformly randomly selected in exploration – Best arm identified in exploration is always selected in exploitation

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

S TATE - OF - THE -A RT O NLINE E XPERIMENTS • Google Analytics Content Experiments – A platform for publishers to test ways to improve their websites • Use Thompson sampling to direct traffic among different variations – Update posterior beliefs on conversion rates twice daily – Variations with cumulative better performance get more traffic in the future • Benefits vs. A/B testing?

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

A S IMULATION E XAMPLE • Two versions of a website – A: original version, 4% conversion rate (CvR) – B: new version, predicted 5% CvR • If we do A/B testing on which version is better – Null hypothesis? • A’s CvR = B’s CvR – Alternative hypothesis? • A’s CvR ≠ B’s CvR • How many observations do we need?

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

P OWER A NALYSIS • Significance level: 5% • Effect size: 5% – 4 % = 1% • Power: 95% • Two-sample proportion test • R function: – power.prop.test(n = NULL, p1 = 0.04, p2 = 0.05, sig.level = 0.05, power = 0.95, alternative = “two.sided”) • Required sample size ~ 22,300 • Assume 100 obs./day, need 223 days

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

T HOMPSON S AMPLING A PPROACH • Recall from last lecture Thompson sampling 1. For each arm, sample parameters from prior beliefs 2. Choose the arm with the highest expected reward given the sampled parameters 3. Pull that arm and observe actual reward 4. Update belief for the pulled arm only (using Bayes’ rule) • Use Thompson sampling to redirect traffic – Probability matching: % of traffic to A = Prob(A CvR > B CvR)

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

T HOMPSON S AMPLING – B ERNOULLI B ANDIT • Conversion can be viewed as a Bernoulli event – Convert with prob. p à p is CvR • Online content experiment wants to learn the ࠵? ࠵? ∗ ’s for each variation ࠵? • For Bernoulli bandits, we usually use the Beta distribution as the prior • Why? – Mathematically, can compute posterior easily – Practically, Beta distribution is flexible

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

B ETA D ISTRIBUTION • Beta distribution is flexible to model a large range of actual shapes

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

P OSTERIOR FOR B ERNOULLI B ANDIT WITH B ETA P RIOR • If prior belief for an arm is Beta(1, 1) (the uniform distribution; no information) • Then posterior belief for that arm after getting ࠵? − ࠵? successes and ࠵? − ࠵? failures is Beta( ࠵?, ࠵? ) • To update, just need to keep track of number of successes (conversions) and failures (non- conversions)!

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

T HOMPSON S AMPLING FOR L ARGE O NLINE E XPERIMENTS • Don’t update after every observation, but periodically • Google Analytics: update twice daily • After each update, need to determine % of traffic going to each arm • Probability matching: % of traffic to A = Prob(A CvR > B CvR)

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

T HOMPSON S AMPLING TO D IRECT E XPERIMENT T RAFFIC • Need to estimate Prob(A CvR > B CvR) given current prior (Beta distributions) – Directly compute this prob. from the distribution is formidable • Simulate a large matrix – Each row is a random draw (i.e., a sampled CvR) – Each column is one variation (A or B) • Estimated Prob(A CvR > B CvR) = Empirical fraction of rows where column A sample > column B sample

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

T HOMPSON S AMPLING – S IMULATION • For one simulation – Black = original – Red = new • Graph shows the probability of either being optimal (sums to 1) • Stop when either hits 95% • Finish in 66 days (saved 157 days) and correctly identify the optimum

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

T HOMPSON S AMPLING – 500 S IMULATIONS On average, saved 175 days and increased 98 conversions

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

A NOTHER E XAMPLE • 6 versions of the website – Original: 4% conversion rate – Alternatives: 5%, 4.5%, 3.5%, 3%, 2% conversion rates • If we do A/B testing… – Need 91,842 obs. à 2.5 years! • If we use Thompson sampling…

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

T HOMPSON S AMPLING S IMULATION • By 100 days: inferior arms are all eliminated • Less wasteful (earn more while learning) • Learn more (more observations for better arms)

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

T HOMPSON S AMPLING – 500 S IMULATIONS On average, saved 831 days and increased 1,173 conversions

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

M ULTI - ARMED B ANDIT A PPROACH IN G OOGLE A NALYTICS • Benefits – Often produces answers more quickly than A/B testing – Efficient: Move toward winning arms without having to wait for the end of the experiment – Faster natural selection: Samples are redistributed from obviously inferior arms to better arms – More samples on better arms help separate “good” arms from “best arms” more quickly

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

T HOMPSON S AMPLING VS . A/B T ESTING FOR V UNGLE • Simulate performance of Thompson sampling • Update posterior beliefs once a day • Assume the eRPM observed each day remains the same as in the A/B testing data – Given the large numbers of impressions, this is reasonable

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

T HOMPSON S AMPLING FOR V UNGLE • Assume that the eRPM for A and B is normally distributed with known variance and unknown mean • Take normal distribution as our prior • Can calculate explicitly Prob(mean A eRPM > mean B eRPM | prior beliefs)

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

D ETERMINE T RAFFIC S PLIT • Suppose ࠵?~࠵? ࠵? ࠵? , ࠵? ࠵? ࠵? , ࠵?~࠵? ࠵? ࠵? , ࠵? ࠵? ࠵? • Prior: ࠵? ࠵? ~࠵? B࠵? ࠵?,࠵? , ࠵? ࠵? ࠵? /(࠵? + ࠵?) ࠵? ࠵? ~࠵? B࠵? ࠵?,࠵? , ࠵? ࠵? ࠵? /(࠵? + ࠵?) So Prob ࠵? ࠵? > ࠵? ࠵? |࠵?~࠵? ࠵? ࠵? , ࠵? ࠵? ࠵? , ࠵?~࠵? ࠵? ࠵? , ࠵? ࠵? ࠵? = P rob ࠵? > ࠵? where ࠵?~࠵? B࠵? ࠵?,࠵? − B࠵? ࠵?,࠵? , (࠵? ࠵? ࠵? + ࠵? ࠵? ࠵? )/(࠵? + ࠵?)

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

T HOMPSON S AMPLING FOR V UNGLE • Begin with 15/16 of traffic going to A on day 1 – To match the probability for A/B testing – Give A a small starting advantage – If B really beats A, then we expect the sample mean of A to fall below that of B with more data – Then traffic will be gradually redirected to B • At the end of each day, update B࠵? ࠵?,࠵?/࠵? and B࠵? ࠵?,࠵?/࠵? to new sample means

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

R ESULTS 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 1 2 3 4 5 6 7 8 9 101112131415161718192021222324252627282930 Fraction of TrafficDirectedto A or B by Day A fraction B fraction 2.5% higher revenue with Thompson sampling vs. A/B testing

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

V UNGLE C ASE W RAP -U P • Multi-armed bandit (Thompson sampling) is state-of-the-art for online experimentation • Learn while you earn! = Efficiency • Google Analytics: https://analytics.googleblog.com/2013/01/multi- armed-bandit- experiments.html#:~:text=Google%20Analytics%20 uses%20a%20multi,updated%20as%20the%20exper iment%20progresses

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

Related Documents

EM365_7-2_MidtermReview.pdf

6. Jordan_10e_EOCExcelSolutions_Chapter08.xls

SNHU 107 Module Four Journal Template Tyrisha (1).docx

SNHU 107 Project Template.docx

ElectrostaticsI&II.pdf

MusicAndItsContextSyllabus.pdf

Amira Hill - Health History .pdf

4-2 Total Rewards Compensation.docx

TEC 100 Pizza Chain.docx

Carrier_48_ESA0360603_Product_Data_Sheets_Student_worksheet.docx

BU 632 - Week 5 Assignment .docx

Browse Popular Homework Q&A

Q: Let f(x) = 4 + x + x2 and h ≠ 0. Find f(x + h). Find (e) Find f(x+h)-f(x)/h and simplify. part…

Q: A weak monoprotic acid is 0.50% ionized in a 1.0 M solution. The dissociation constant for this weak…

Q: Examine the alkyl bromides below. Which is the most reactive as a substrate in E2 reactions with…

Q: Procedure: Review the summary of a published scientific study below: Patients with multiple…

Q: The percent by mass of phenol (MM = 94.11 g/mol) in an aqueous solution is 10.9%. What is the…

Q: How many asymmetric centers are in the molecule on the right?

Q: The following transactions are July activities of Bennett's Bowling, Inc., which operates several…

Q: 13 of 36 > Consider the pair of reactions. Draw the organic products, then predict the type of…

Q: Suppose that W is a discrete random variable and its cumulative distribution function is given in…

Q: on ni aliuzen en B. Gene A contains a base substitution that results in a nonsense mutation. HTW…

Q: Write a class called Pokemon that has a name (e.g. “Pikachu”) and a number (e.g. 25). Implement a…

Q: If the second of three consecutive integers is subtracted from 108, the results is the sum of the…

Q: The projections of the LGN to the visual cortex are easy to tract because: layers 1 and 2 are…

Q: 2) Graph the following conic. -4x² + 8x + y² - 4y + 4 = 0

Q: == What was the order of the pigments on the chromatography paper from the bottom up? Postlab 6.13…

Q: b. quantity of inputs used and quantity of output generated.

Q: What are some affordances of visual mediums?

Q: In term of the sugar identity , what is the difference between DNA and RNA. 2 What nucleobases are…

Q: If m 42 = 112°, then m 47 = 2 + 36 78 B C

Q: A = -3 0 17 1] 506 803

Q: Journal entry worksheet < 1 2 3 4 5 6 As of December 31, Lyn Addie has not been paid for four days…

Q: An electric field of strength 1.84 kv/m and a magnetic field of 0.499 T act on a moving electron to…

Q: A charge of -20 nC is placed in a region with constant electric field of 4.5 x 10^5 N/C directed in…

Q: On January 1, the long-term liability section of Cracker & Company Balance Sheet showed a balance of…

Q: 1 People who are severely obese (BMI ≥40) are at most risk for serious health problems, which are…