作者 | 梁唐
大家好,日拱一卒,我是梁唐。
咱们继续来聊聊伯克利的CS61A,这次分享的是这门课的第一个project。虽然说是project,但其实代码量不大。难度也不高,需要了解一点Python的基础语法和基本的函数式编程的思想。如果对于函数式编程还不太熟悉的,可以去历史记录里回看一下上一篇文章。
整个项目的代码量并不大,但代码质量很高,非常严谨符合规范。非常适合新手学习,既可以加深理解,也可以提高代码质量和设计规范。
由于原文档为英文,对于一些小伙伴来说可能比较吃力。所以我把它做了一个翻译,有些地方加上了一些说明。想要查看完整的代码 实现的同学可以点击【阅读原文】访问我的GitHub仓库。
项目原始文档:https://inst.eecs.berkeley.edu/~cs61a/sp18/proj/hog/
简介
我们需要实现一个小游戏Hog的核心代码,Hog是两名玩家轮流掷骰子拼点数的比拼游戏。玩家每次可以选择掷0至10颗骰子,不同的策略可以获取不同的积分,积分先到100的玩家获胜。
计算骰子点数有以下规则:
- Pig Out:如果玩家掷出的骰子当中有骰子是1点,那么该回合只能获取1点,否则获取所有骰子点数之和
- 案例1:玩家掷出7枚骰子,其中第三枚为1点,玩家此回合获取1分
- 案例2:玩家掷出4枚骰子,均为3点,玩家此回合获取12分
- Free Bacon:当玩家选择掷出0枚骰子时,触发Free Bacon规则。获取积分为对手玩家积分个位和十位分差绝对值再加2。
- 案例1:对手有42分,free bacon可以获得|4-2| 2=4分
- 案例2:对手有48分,free bacon可以获得|4-8| 2=6分
- 案例3:对手有7分,free bacon可以获得|0-7| 2=9分
- Swine Swap:当两名玩家积分均大于1,并且其中一名为另外一名倍数时,两人积分互换
- 案例1:玩家0当前有37分,对手有92分,此回合获得9分之后变成了46分,92是46的两倍。触发Swine Swap,于是两人分数互换
- 案例2:玩家0当前有91分,对手有37分,此回合获得20分,积分变成111。但由于111是37的三倍,于是触发Swine Swap,两人积分互换,玩家1拥有超过100分,玩家1获胜
整个项目的框架已经搭好,只需要我们按照要求实现其中一些具体的函数。
最后当我们实现完成所有功能以及策略,可以和指定的策略进行比拼,统计获胜率。全项目一共有三个阶段11题,可以理解成一共有11个函数需要我们实现。
每一个阶段完成之后,都会有一个阶段性成果。难度不大,但很有意思,非常推荐刚入门的同学练习。
阶段1
阶段1主要实现游戏功能的主体,完成之后我们可以通过项目里提供的gui进行双人比赛。
Problem 1:
实现roll_dice
函数,roll_dice
函数拥有两个参数,第一个参数是一个整数,表示投掷的次数。第二个参数是一个函数,表示骰子的类型。返回投掷num_rolls
次之后获得的积分,当出现Pig Out获得1分,否则为骰子点数之和。
调用dice()
函数获取骰子掷出的点数,必须要投掷num_rolls
次骰子,即使中途出现了Pig Out。
def roll_dice(num_rolls, dice=six_sided):
"""Simulate rolling the DICE exactly NUM_ROLLS > 0 times. Return the sum of
the outcomes unless any of the outcomes is 1. In that case, return 1.
num_rolls: The number of dice rolls that will be made.
dice: A function that simulates a single dice roll outcome.
"""
# These assert statements ensure that num_rolls is a positive integer.
assert type(num_rolls) == int, 'num_rolls must be an integer.'
assert num_rolls > 0, 'Must roll at least once.'
# BEGIN PROBLEM 1
"*** YOUR CODE HERE ***"
# END PROBLEM 1
我们只需要使用循环重复执行num_rolls
次,然后判断一下当中有没有出现1即可。
def roll_dice(num_rolls, dice=six_sided):
"""Simulate rolling the DICE exactly NUM_ROLLS > 0 times. Return the sum of
the outcomes unless any of the outcomes is 1. In that case, return 1.
num_rolls: The number of dice rolls that will be made.
dice: A function that simulates a single dice roll outcome.
"""
# These assert statements ensure that num_rolls is a positive integer.
assert type(num_rolls) == int, 'num_rolls must be an integer.'
assert num_rolls > 0, 'Must roll at least once.'
# BEGIN PROBLEM 1
"*** YOUR CODE HERE ***"
ret = 0
pigout = False
for _ in range(num_rolls):
score = dice()
if score == 1:
pigout = True
continue
ret = score
return 1 if pigout else ret
# END PROBLEM 1
Problem 2:
实现free_bacon
函数,返回当掷出0颗骰子时获取的点数。输入为对手玩家的分数,可以假设分数一定小于100,如果分数小于10,那么十位视作0。
def free_bacon(score):
"""Return the points scored from rolling 0 dice (Free Bacon).
score: The opponent's current score.
"""
assert score < 100, 'The game should be over.'
# BEGIN PROBLEM 2
"*** YOUR CODE HERE ***"
# END PROBLEM 2
计算出分数的十位和个位,作差之后求绝对值,再加上2即可。
代码语言:javascript复制def free_bacon(score):
"""Return the points scored from rolling 0 dice (Free Bacon).
score: The opponent's current score.
"""
assert score < 100, 'The game should be over.'
# BEGIN PROBLEM 2
"*** YOUR CODE HERE ***"
return abs(score // 10 - score % 10) 2
# END PROBLEM 2
Problem 3:
实现take_turn
函数,给定num_rolls
和dice
,返回玩家此回合获得的分数。
你需要先实现free_bacon
函数,你实现的take_turn
函数当中必须要调用roll_dice
和free_bacon
函数。
def take_turn(num_rolls, opponent_score, dice=six_sided):
"""Simulate a turn rolling NUM_ROLLS dice, which may be 0 (Free Bacon).
Return the points scored for the turn by the current player.
num_rolls: The number of dice rolls that will be made.
opponent_score: The total score of the opponent.
dice: A function that simulates a single dice roll outcome.
"""
# Leave these assert statements here; they help check for errors.
assert type(num_rolls) == int, 'num_rolls must be an integer.'
assert num_rolls >= 0, 'Cannot roll a negative number of dice in take_turn.'
assert num_rolls <= 10, 'Cannot roll more than 10 dice.'
assert opponent_score < 100, 'The game should be over.'
# BEGIN PROBLEM 3
"*** YOUR CODE HERE ***"
# END PROBLEM 3
根据num_rolls
判断是投掷骰子,还是执行free bacon
。只要看懂了题意,同样没有难度。
def take_turn(num_rolls, opponent_score, dice=six_sided):
"""Simulate a turn rolling NUM_ROLLS dice, which may be 0 (Free Bacon).
"""
# Leave these assert statements here; they help check for errors.
assert type(num_rolls) == int, 'num_rolls must be an integer.'
assert num_rolls >= 0, 'Cannot roll a negative number of dice in take_turn.'
assert num_rolls <= 10, 'Cannot roll more than 10 dice.'
assert opponent_score < 100, 'The game should be over.'
# BEGIN PROBLEM 3
"*** YOUR CODE HERE ***"
if num_rolls == 0:
return free_bacon(opponent_score)
else:
return roll_dice(num_rolls, dice)
# END PROBLEM 3
Problem 4:
实现is_swap
函数,返回当前玩家的分数是否会触发交换
def is_swap(score0, score1):
"""Return whether one of the scores is an integer multiple of the other."""
# BEGIN PROBLEM 4
"*** YOUR CODE HERE ***"
# END PROBLEM 4
考虑分数为0的情况,分数为0时不应该触发is_swap
。
判断两个人的分数是否成倍数关系即可:
代码语言:javascript复制def is_swap(score0, score1):
"""Return whether one of the scores is an integer multiple of the other."""
# BEGIN PROBLEM 4
"*** YOUR CODE HERE ***"
if score0 <= 1 or score1 <= 1:
return False
return score0 % score1 == 0 or score1 % score0 == 0
# END PROBLEM 4
Problem 5:
实现play
函数,它模拟了一整局游戏。玩家轮流行动,每次行动时都会使用它们指定的策略。如玩家0,使用的策略就是strategy0
,直到有玩家分数达到目标goal
。当游戏结束时,按顺序返回两名玩家的分数,玩家0在先,玩家1在后。
提示:
- 必须要调用刚刚开发的函数,需要调用
take_turn
并传入对应的三个参数 - 每一回合只能调用一次
take_turn
- 考虑所有边界情况
- 当回合结束时,你可以调用函数
other
,获取下一名玩家 - 你可以无视
say
参数 strategy0
和strategy1
本身都是函数,传入当前玩家和对手玩家的分数之后,它会返回玩家需要投掷的骰子数量。每回合只能调用strategy
一次。
def play(strategy0, strategy1, score0=0, score1=0, dice=six_sided,
goal=GOAL_SCORE, say=silence):
"""Simulate a game and return the final scores of both players, with Player
0's score first, and Player 1's score second.
A strategy is a function that takes two total scores as arguments (the
current player's score, and the opponent's score), and returns a number of
dice that the current player will roll this turn.
strategy0: The strategy function for Player 0, who plays first.
strategy1: The strategy function for Player 1, who plays second.
score0: Starting score for Player 0
score1: Starting score for Player 1
dice: A function of zero arguments that simulates a dice roll.
goal: The game ends and someone wins when this score is reached.
say: The commentary function to call at the end of the first turn.
"""
player = 0 # Which player is about to take a turn, 0 (first) or 1 (second)
# BEGIN PROBLEM 5
"*** YOUR CODE HERE ***"
# END PROBLEM 5
照着题目意思实现逻辑即可,基本上没有难度。
代码语言:javascript复制def play(strategy0, strategy1, score0=0, score1=0, dice=six_sided,
goal=GOAL_SCORE, say=silence):
"""Simulate a game and return the final scores of both players, with Player
0's score first, and Player 1's score second.
"""
player = 0 # Which player is about to take a turn, 0 (first) or 1 (second)
# BEGIN PROBLEM 5
"*** YOUR CODE HERE ***"
while True:
if score0 >= goal or score1 >= goal:
break
if player == 0:
num = strategy0(score0, score1)
score0 = take_turn(num, score1, dice)
else:
num = strategy1(score1, score0)
score1 = take_turn(num, score0, dice)
if is_swap(score0, score1):
score0, score1 = score1, score0
player = other(player)
# END PROBLEM 5
return score0, score1
阶段2
阶段2主要实现一些评论函数,可以在一些关键节点打印出日志给玩家以提示。
类似这种"22 points! That's the biggest gain yet for Player 1."
,玩家1打破了记录,获得了22分。
老师给了评论函数的一个例子:
代码语言:javascript复制def say_scores(score0, score1):
"""A commentary function that announces the score for each player."""
print("Player 0 now has", score0, "and Player 1 now has", score1)
return say_scores
注意这个say_scores
函数最后返回了它自己,我们每一轮调用的都是同样的函数。
announce_lead_changes
是使用函数式编程返回领先玩家的例子:
def announce_lead_changes(previous_leader=None):
"""Return a commentary function that announces lead changes.
>>> f0 = announce_lead_changes()
>>> f1 = f0(5, 0)
Player 0 takes the lead by 5
>>> f2 = f1(5, 12)
Player 1 takes the lead by 7
>>> f3 = f2(8, 12)
>>> f4 = f3(8, 13)
>>> f5 = f4(15, 13)
Player 0 takes the lead by 2
"""
def say(score0, score1):
if score0 > score1:
leader = 0
elif score1 > score0:
leader = 1
else:
leader = None
if leader != None and leader != previous_leader:
print('Player', leader, 'takes the lead by', abs(score0 - score1))
return announce_lead_changes(leader)
return say
在开发之前,你需要先看懂下面这个both
函数:
def both(f, g):
"""Return a commentary function that says what f says, then what g says.
>>> h0 = both(say_scores, announce_lead_changes())
>>> h1 = h0(10, 0)
Player 0 now has 10 and Player 1 now has 0
Player 0 takes the lead by 10
>>> h2 = h1(10, 6)
Player 0 now has 10 and Player 1 now has 6
>>> h3 = h2(6, 18) # Player 0 gets 8 points, then Swine Swap applies
Player 0 now has 6 and Player 1 now has 18
Player 1 takes the lead by 12
"""
def say(score0, score1):
return both(f(score0, score1), g(score0, score1))
return say
它接收两个评论函数f
和g
,并且返回一个全新的评论函数。这个返回的函数中的f
和g
这两个参数,又是调用f
和g
之后获得的结果。
这一段涉及函数式编程的嵌套,会显得有些绕,最好根据这个例子理解:
代码语言:javascript复制>>> h0 = both(say_scores, announce_lead_changes())
>>> h1 = h0(10, 0)
Player 0 now has 10 and Player 1 now has 0
Player 0 takes the lead by 10
>>> h2 = h1(10, 6)
Player 0 now has 10 and Player 1 now has 6
>>> h3 = h2(6, 18) # Player 0 gets 8 points, then Swine Swap applies
Player 0 now has 6 and Player 1 now has 18
Player 1 takes the lead by 12
Problem 6:
更新的你play
函数,使得在每一个回合的结束都调用一下评论函数。评论函数调用之后会返回一个新的评论函数用在下一个回合。
比如说say(score0, score1)
将在第一个回合结束时调用,它会返回另外一个评论函数,返回的这个函数需要在第二个回合调用。也就是说每个回合调用的评论函数都是上一个回合返回的。
看懂了say
函数之后,我们只需要在play
函数当中加上say
方法的调用即可。
由于say
函数是一个高阶函数,它运行之后会返回一个函数,注意这点,不难实现。
def play(strategy0, strategy1, score0=0, score1=0, dice=six_sided,
goal=GOAL_SCORE, say=silence):
"""Simulate a game and return the final scores of both players, with Player
0's score first, and Player 1's score second.
"""
player = 0 # Which player is about to take a turn, 0 (first) or 1 (second)
# BEGIN PROBLEM 5
"*** YOUR CODE HERE ***"
while True:
if score0 >= goal or score1 >= goal:
break
if player == 0:
num = strategy0(score0, score1)
score0 = take_turn(num, score1, dice)
else:
num = strategy1(score1, score0)
score1 = take_turn(num, score0, dice)
if is_swap(score0, score1):
score0, score1 = score1, score0
player = other(player)
say = say(score0, score1)
# END PROBLEM 5
return score0, score1
Problem 7:
实现announce_highest
函数,它是一个高阶函数,返回一个评论函数。这个评论函数会在指定玩家单个回合获得新的最大得分时打印日志。要实现这点,它必须要计算当前得分,并且和历史最高得分进行比较。函数的第一个参数who
指定了需要跟踪记录的玩家,其他玩家的得分可以忽略。
需要注意,在打印得分时,需要注意单复数。得分大于1时,使用points
,只有1分时,使用point
。
def announce_highest(who, previous_high=0, previous_score=0):
"""Return a commentary function that announces when WHO's score
increases by more than ever before in the game.
>>> f0 = announce_highest(1) # Only announce Player 1 score gains
>>> f1 = f0(11, 0)
>>> f2 = f1(11, 1)
1 point! That's the biggest gain yet for Player 1
>>> f3 = f2(20, 1)
>>> f4 = f3(5, 20) # Player 1 gets 4 points, then Swine Swap applies
19 points! That's the biggest gain yet for Player 1
>>> f5 = f4(20, 40) # Player 0 gets 35 points, then Swine Swap applies
20 points! That's the biggest gain yet for Player 1
>>> f6 = f5(20, 55) # Player 1 gets 15 points; not enough for a new high
"""
assert who == 0 or who == 1, 'The who argument should indicate a player.'
# BEGIN PROBLEM 7
"*** YOUR CODE HERE ***"
# END PROBLEM 7
这题稍稍有些难,多关注一下注释中的测试样例。
主要思路是在announce_highest
中实现一个函数,我这里函数名叫say
。然后通过这两个函数的互相调用来实现功能。
如果想不明白可以看一下公开课的视频,老师专门开了一个小节讲这题。
代码语言:javascript复制def announce_highest(who, previous_high=0, previous_score=0):
"""Return a commentary function that announces when WHO's score
increases by more than ever before in the game.
"""
assert who == 0 or who == 1, 'The who argument should indicate a player.'
# BEGIN PROBLEM 7
"*** YOUR CODE HERE ***"
def say(score0, score1):
if who == 0:
gain = score0 - previous_score
new_score = score0
else:
gain = score1 - previous_score
new_score = score1
new_high = previous_high
if gain > previous_high:
new_high = gain
print("{} {}! That's the biggest gain yet for Player {}".format(gain, 'points' if gain > 1 else 'point', who))
return announce_highest(who, new_high, new_score)
return say
# END PROBLEM 7
阶段3
在第三个阶段,需要实现一些游戏策略,设计一些简单的ai。
Problem 8:
实现make_averaged
函数,它是一个高阶函数。接收一个参数fn
,fn
是一个函数。make_averaged
函数返回一个函数,它接收和fn
同样的参数。这个返回的参数和fn
不同,它返回调用fn
函数num_samples
之后结果的均值。
说起来有些拗口,稍微解释一下。即我们要实现一个高阶函数make_averaged
,它返回一个函数,我们把这个返回的函数称为r
。r
函数接收的参数和fn
一样,调用r
会得到将fn
函数调用num_samples
次之后的均值。
这里最大的问题在于我们不知道fn
会接收什么样的参数,那又怎么来确定r
的接收参数格式呢?
这里需要用到一个小技巧,在Python传参的过程当中,当我们不知道具体传入的参数数量时,我们可以把若干个连续的必填参数写成*arg
。这里的arg
表示一个列表,*arg
表示将列表展开。其实*
符号后面跟什么名字都行,都表示一个列表名。
作业当中提供了一个例子:
代码语言:javascript复制>>> def printed(fn):
... def print_and_return(*args):
... result = fn(*args)
... print('Result:', result)
... return result
... return print_and_return
>>> printed_pow = printed(pow)
>>> printed_pow(2, 8)
Result: 256
256
>>> printed_abs = printed(abs)
>>> printed_abs(-10)
Result: 10
10
在这个例子当中,print_and_return
函数接收的参数就是*args
,它可以接收若干个参数。在print_and_return
函数当中,我们又调用了fn
函数,将*args
又传给了fn
。通过这种方式,其实就把fn
函数的接收参数通过print_and_return
传递了进来。
在本题当中,我们也需要使用类似的方法实现:
代码语言:javascript复制def make_averaged(fn, num_samples=1000):
"""Return a function that returns the average value of FN when called.
"""
# BEGIN PROBLEM 8
"*** YOUR CODE HERE ***"
def process(*args):
tot = 0
for _ in range(num_samples):
tot = fn(*args)
return tot / num_samples
return process
# END PROBLEM 8
Problem 9:
实现max_scoring_num_rolls
函数,它接收两个参数dice
和num_samples
,其中dice
表示骰子的类型,num_samples
表示实验的次数。它返回num_samples
实验之后,收益均值最大的骰子的数量。
你需要调用你刚刚实现的make_averaged
和roll_dice
函数来完成。返回的结果在1到10之间(包括1和10)。当两种情况得分相同时,返回骰子数量少的。比如3个骰子和6个骰子的得分相同,返回3。
只要开发出了make_averaged
,基本上没有难度,枚举一下所有骰子的可能,选出平均收益(收益期望)最大的即可。
def max_scoring_num_rolls(dice=six_sided, num_samples=1000):
"""Return the number of dice (1 to 10) that gives the highest average turn
score by calling roll_dice with the provided DICE over NUM_SAMPLES times.
Assume that the dice always return positive outcomes.
>>> dice = make_test_dice(1, 6)
>>> max_scoring_num_rolls(dice)
1
"""
# BEGIN PROBLEM 9
"*** YOUR CODE HERE ***"
ret, max_score = 1, 0
for i in range(1, 11):
avearge_dice = make_averaged(roll_dice, num_samples)
score = avearge_dice(i, dice)
if score > max_score:
ret, max_score = i, score
return ret
# END PROBLEM 9
Problem 10:
实现bacon_strategy
函数,这个策略会在投掷0收益最优时尽可能投掷0颗骰子。
具体逻辑为:当投掷0颗骰子的收益大于等参数margin
时,投掷0颗骰子,否则投掷num_rolls
颗。
def bacon_strategy(score, opponent_score, margin=8, num_rolls=4):
"""This strategy rolls 0 dice if that gives at least MARGIN points, and
rolls NUM_ROLLS otherwise.
"""
# BEGIN PROBLEM 10
gain = free_bacon(opponent_score)
if gain >= margin:
return 0
return num_rolls
# END PROBLEM 10
Problem 11:
实现swap_strategy
策略,这个策略会尽可能利用Swine Swap规则。当投掷0会触发Swine Swap,并且获得很大收益时投掷0。如果投掷0颗收益大于等于参数margin,并且不会触发Swine Swap时也投掷0。否则投掷num_rolls
。
def swap_strategy(score, opponent_score, margin=8, num_rolls=4):
"""This strategy rolls 0 dice when it triggers a beneficial swap. It also
rolls 0 dice if it gives at least MARGIN points. Otherwise, it rolls
NUM_ROLLS.
"""
# BEGIN PROBLEM 11
gain = free_bacon(opponent_score)
score = gain
if gain >= margin and not is_swap(score, opponent_score):
return 0
if is_swap(score, opponent_score) and opponent_score >= score margin:
return 0
return num_rolls
# END PROBLEM 11
Problem 12:
附加题,发挥你个人的才智将上述实现的策略进行结合实现终极策略,尽可能和在始终投掷4的ai比拼当中获得更高的胜率。一些建议:
swap_strategy
是一个很好的默认策略- 获得超过100分没有意义,查看能否通过掷出0、1、2枚骰子获胜,如果领先,可以投掷少一些的骰子降低风险
- 尝试获得有收益的交换
- 谨慎选择
num_rolls
和margin
这题没有标准答案,大家可以自由发挥。提供一个我根据提示实现的版本:
代码语言:javascript复制def final_strategy(score, opponent_score):
"""Write a brief description of your final strategy.
*** YOUR DESCRIPTION HERE ***
"""
# BEGIN PROBLEM 12
# 落后时,如果获得1分可以与对手交换,那么就投10个骰子
if opponent_score > score and is_swap(score 1, opponent_score):
return 10
# 领先时,如果获得1分会被交换,返回0
if score > opponent_score and is_swap(score 1, opponent_score):
return 0
# 判断free bacon是否可以获胜
max_roll = max_scoring_num_rolls(num_samples=100)
num = swap_strategy(score, opponent_score, 100-score, max_roll)
if num == 0:
return 0
# 不投0颗,是否有获胜可能
avg_dice = make_averaged(roll_dice, 100)
for i in range(1, 4):
gain = avg_dice(i)
if score gain >= 100:
return i
# 判断投掷0颗是否可以反超,如果领先,则以获得6分以上为期望
margin = opponent_score - score 1 if opponent_score > score else 6
return swap_strategy(score, opponent_score, margin, max_roll)
# END PROBLEM 12
到这里整个项目就算是开发完了,我们可以调用一下老师提供的工具,计算一下我们开发的策略对战一直投掷4个骰子策略的胜率。
目前我在基于规则的情况下只能获得60%左右的胜率,我后来使用强化学习训练了模型,但是最好情况下也只获得了61%的胜率,比基于策略的略高一点而已。大家不妨也试试看,看看有没有更好的策略。
我们也可以调用现成的gui界面,和我们开发的策略对战:
代码语言:javascript复制python3 hog_gui.py -f
这个游戏看起来虽然简单,但是完整地顺着老师的思路把代码一点点实现下来,还是很有成就感的,并且难度梯度做得很好,虽然项目本身不小,但实现难度并不大。
另外,游戏本身也很值得我们深挖,由于Pig Out规则的存在,导致收益存在一个巨大的波动。投掷的骰子多了,很容易影响收益。投的少了,收益也少。在这种情况下想要设计出一个厉害的ai还是挺有难度的。非常非常强烈推荐大家亲自试试。
好了,关于这个项目就先聊到这里,感谢大家的阅读。