วิธีคิดแบบนักวิทยาการคอมพิวเตอร์/Stacks

บทที่ 18 Stacks

18.1 Abstract data types

ชนิดข้อมูลที่คุณเคยเห็นมาทั้งหมดเป็นแบบรูปธรรม ซึ่งเราสามารถเข้าใจถึงมันได้โดยง่าย เป็นต้นว่าคลาส Card แทนตัว card ด้วย integers 2 ตัว ดังที่เคยได้กล่าวไว้แล้วมันไม่ใช่ทางเดียวที่จะแทนตัว card มันยังมีทางเลือกอื่นอีกหลายทางที่จะทำการ implementations

abstract data type หรือ ADT คือการระบุชุดของ operations หรือ method และความหมายของ operations ( สามารถทำอะไรได้ ) แต่ไม่ได้ระบุถึง implementation ของ operations นั่นจึงเป็นสิ่งที่ทำให้มันเป็นนามธรรม

ทำไมจึงมีประโยชน์

• มันเป็นการง่ายที่คุณเพียงแค่เจาะจงเฉพาะ algorithm ถ้าคุณสามารถใช้ operations ที่ต้องการโดยไม่ต้องคิดว่าจะต้องสร้างการทำงานของ operations นั้นเอง

• มีทางเลือกมากมายที่จะ implement ตัว ADT มันอาจเป็นประโยชน์อย่างมากที่จะเขียน algorithm ที่สามารถใช้กับ implementations ใดๆที่เป็นไปได้

• ADTs ที่รู้จักกันดีอย่าง Stack ADT ที่จะกล่าวถึงในบทนี้ซึ่งเป็น implement ที่ถูกเพิ่มเข้าไปใน standard libraries ดังนั้นเมื่อพวกมันถูกเขียนขึ้นครั้งหนึ่งแล้วก็สามารถนำไปใช้ในหลายๆโปรแกรมได้

• Operations บน ADTs เขียนขึ้นด้วย high-level language เพื่อเจาะจง และกล่าวถึงเฉพาะ algorithm

เมื่อเราพูดเกี่ยวกับเรื่อง ADTs เราจะแยกความแตกต่างของ code ที่ใช้ ADT เรียกว่า client code และจาก code ที่ implement ADT จะเรียกว่า provider code

18.2 The Stack ADT

ในบทนี้เราจะมาดูตัว ADT ทั่วไปอย่าง stack ซึ่ง stack ก็คือ collection ซึ่งหมายถึงโครงสร้างข้อมูลที่เก็บ element ได้หลายตัว collections อื่นๆที่เราได้เห็นมาแล้วก็คือ dictionaries และ lists

ADT ถูกจำกัดความโดย operation ที่แสดงอยู่บนตัวมัน ซึ่งจะเรียกว่า interface ซึ่ง interface ของ stack ประกอบด้วย operations ดังนี้

__init__ : Initialize a new empty stack.

push: Add item ใหม่ลง stack.

pop: Remove และ return item จาก stack ซึ่ง item ที่ return ออกมานั้นจะเป็น item ตัวสุดท้ายที่ได้ add เข้าไป

isEmpty: ตรวจสอบว่า stack ว่างเปล่าหรือไม่

บางครั้ง stack จะถูกเรียกว่าโครงสร้างข้อมูลแบบ LIFO คือ เข้าหลังออกก่อนเพราะ item ที่ add เข้าไปหลังสุดจะถูก remove ก่อน

18.3 Implementing stacks with Python lists

Operations ของ list ในภาษา Python จะมีลักษณะคล้ายกับ operations ของ stack แต่ส่วน interface ไม่เหมือนกันซะทีเดียวแต่เราสามารถเขียน code เพื่อแปลง Stack ADT มาเป็น built-in operations ได้

ตัว code นี้เราเรียกว่า implementation ของ Stack ADT โดยทั่วๆไป implementation คือกลุ่มของ method ตอบสนอง syntactic และ semantic ความต้องการของ interface

นี่คือ implementation ของ Stack ADT ที่ใช้ Python list

class Stack :

def __init__(self) :

self.items = []

def push(self, item) :

self.items.append(item)

def pop(self) :

return self.items.pop()

def isEmpty(self) :

return (self.items == [])

Stack object มี attribute ที่ชื่อว่า items ก็คือ list ของ items ใน stack โดย initialization method จะกำหนดให้ items เป็น list ที่ว่างเปล่า

เพื่อใส่ item ใหม่ลงใน stack จะใช้ append method ของ list เพื่อเพิ่มมันลงไปใน items และเพื่อเรียกข้อมูลออกมาจาก stack จะใช้ pop method ของ list เพื่องดึง item ตัวสุดท้ายออกมา

สุดท้ายคือการตรวจสอบว่า stack นี้ว่างเปล่าหรือไม่ isEmpty จึงเปรียบเทียบว่า items ใช่ list ที่ว่างเปล่าหรือไม่

การ implementation ที่นำเอา method มาประกอบกันอย่างง่ายด้วย method ที่มีอยู่จะเรียกว่า veneer ลองนึกภาพเหมือนในชีวิตจริง veneer คือไม้คุณภาพดีบางๆที่คลุมอยู่บนเฟอร์นิเจอร์เพื่อปกปิดไม้คุณภาพต่ำที่อยู่ภายใน Computer scientists ใช้สิ่งนี้เปรียบเทียบกับในการบรรยายถึง code เล็กๆที่ซ่อนรายละเอียดของ implementation และทำได้ง่ายกว่ามี standard และ interface มากกว่า

18.4 Pushing and popping

Stack คือ generic data structure ซึ่งหมายถึงว่าเราสามารถเพิ่ม item ชนิดใดก็ได้เข้าไปตัวอย่างข้างล่างนี้เป็นการ pushes integers 2 ตัวและ string ใส่ลงไปใน stack

>>> s = Stack()

>>> s.push(54)

>>> s.push(45)

>>> s.push("+")

เราสามารถใช้ isEmpty และ pop เพื่อ remove และพิมพ์ค่า items ทั้งหมดใน stack

while not s.isEmpty() :

print s.pop(),

ผลลัพธ์ก็คือ + 45 54 การพิมพ์ค่าของ stack เป็นการพิมพ์ค่าจากหลังไปหน้ามันไม่ใช่การพิมพ์ที่เป็นรูปแบบมาตราฐานของ list

คุณลองเปรียบเทียบ code ที่ implementation ไปของ printBackward ใน Section 17.4 ทั้ง recursive version ของ printBackward และ algorithm ของ stack ทั้งคู่สามารถเข้ากันได้ ข้อแตกต่างคือ printBackward ใช้ runtime stack ในการติดตาม node ขณะที่มันเข้าไปตรวจสอบใน list และพิมพ์ค่ามาจากด้านหลัง algorithm ของ stack ก็ทำอย่างเดียวกัน เว้นแต่มันใช้ Stack object แทนที่ของ runtime stack

18.5 Using a stack to evaluate postfix

ในภาษาการ programming ส่วนใหญ่ expressions ทางคณิตศาสตร์เขียนโดยใช้ operator อยู่ระหว่าง operands 2 ตัวเช่น 1+2 รูปแบบนี้เรียกว่า infix ยังมีรูปแบบหนึ่งที่ใช้ในการคำนวณเรียกว่า postfix ซึ่งมีรูปแบบเป็น operator ตามหลัง operands เช่น 1 2 +

บางครั้งเหตุผลที่เราใช้ประโยชน์จาก postfix นั้นก็คือการใช้ natural way เพื่อประเมิน postfix expression โดยใช้ stack

• เริ่มต้นจากดูจุดเริ่มต้นของ expression และตรวจสอบดูว่าเป็น operator หรือ operand

- ถ้าเป็น operand ให้ push มันลง stack

- ถ้าเป็น operator ให้ pop ตัว operands 2 ตัวจากออกมาแล้วทำการคำนวณตาม operator แล้วจึง push ผลลัพธ์ลง stack

• เมื่อจบ expression แล้วจะได้ operand เหลืออยู่ 1 ตัวใน stack ซึ่งก็คือผลลัพธ์นั่นเอง

ตัวอย่างที่อธิบายถึงข้อดีของ posfix คือเราไม่ต้องใช้วงเล็บในการควบคุมลำดับการคำนวณเพื่อให้ได้ผลลัพธ์ที่เหมือนกับในรูปแบบ infix เราจะต้องเขียนเป็น (1 + 2) * 3

18.6 Parsing

เพื่อ implement algorithm ก่อนหน้านี้ เราต้องสามารถเข้าไปดูใน string และหยุดเมื่อเจอ operands และ operators และประมวลผล example of parsing และผลลัพธ์ -- the individual chunks of the string -- เรียกว่า tokens คุณอาจจะจำคำนี้ได้ในบทที่ 1

Python ได้เตรียม split method ไว้ใน string และ re (regular expression) modules ฟังก์ชัน string.split จะแบ่ง string เป็น list โดยใช้ single character เป็น delimiter เป็นต้นว่า

>>> import string

>>> string.split("Now is the time"," ")

['Now', 'is', 'the', 'time']

ในกรณีนี้ delimiter คือ space character ดังนั้น string จึงถูกแบ่งตาม space แต่ละอัน

ฟังก์ชัน re.split มีความสามรถมากกว่าโดยเราสามารถใช้ regular expression แทนที่ delimiter การระบุชุดของ string เราจะเรียกว่า regular expression เป็นต้นว่า [A-z] ก็คือชุดของตัวอักษรทั้งหมด [0-9] คือชุดของตัวเลขทั้งหมด ตัว ^ operator แสดงถึงการปฏิเสธ ดังนั้น [^0-9] ก็คือชุดของทุกๆ character แต่ไม่รวมตัวเลข ซึ่งเป็นรูปแบบการใช้งานที่เราต้องการในการใช้แบ่งตัว postfix expressions

>>> import re

>>> re.split("([^0-9])", "123+456*/")

['123', '+', '456', '*', , '/', ]

สังเกตดูลำดับของ arguments ต่างจาก string.split ตัว delimiter จะมาก่อน string

ผลลัพธ์ของ list ที่ได้จะมี operands 123 และ 456 และ operator คือ * และ / รวมไปถึง string ที่ว่างเปล่า 2 ตัว ซึ่งก็คือ “phantom operands” (operands ที่ไม่มีตัวตน) ซึ่งจะเกิดขึ้นเมื่อ operator ปรากฎขึ้นโดยไม่มีตัวเลขอยู่หน้าหรือหลังมัน

18.7 Evaluating postfix

เพื่อประเมินตัว postfix expression เราจะใช้ parser จาก section ที่แล้วและใช้ algorithm จาก section ก่อนหน้านั้น เพื่อให้ง่ายเราจะเริ่มต้นจากการ implement operator + และ * ก่อน

def evalPostfix(expr):

import re

tokenList = re.split("([^0-9])", expr)

stack = Stack()

for token in tokenList:

if token == or token == ' ':

continue

if token == '+':

sum = stack.pop() + stack.pop()

stack.push(sum)

elif token == '*':

product = stack.pop() * stack.pop()

stack.push(product)

else:

stack.push(int(token))

return stack.pop()

เงื่อนไขแรกคือการดู space และ string ที่ว่างเปล่าและ 2 เงื่อนไขถัดมาคือการรองรับ operator ตอนนี้เราสมมติให้สิ่งอื่นๆจะต้องเป็น operand แน่นอนว่ามันจะดีกว่าที่จะตรวจสอบ erroneous input และแสดง error message แต่เราจะทำมันหลังจากนี้

ตอนนี้ลองทดสอบการประเมิน postfix จาก (56+47)*2

>>> print evalPostfix ("56 47 + 2 *")

206

ตอนนี้เราเข้าใกล้เพียงพอแล้ว

18.8 Clients and providers

หนึ่งในเป้าหมายหลักพื้นฐานของ ADT คือการแบ่งออกเป็น provider ผู้ซึ่งเขียน code ที่ implements ตัว ADT และ client คือผู้ที่ใช้ ADT ผู้ที่จะคอยห่วงเกี่ยวกับเรื่องของการ implementation ให้ถูกต้องก็คือ provider ในการระบุความสอดคล้องกันของ ADT และไม่เกี่ยวกับการที่จะใช้งานมันอย่างไร

ตรงข้ามกับ client ที่คิดอยู่ตลอดว่า implementation ของ ADT ถูกต้องและไม่สนใจในเรื่องรายละเอียด เมื่อคุณใช้ Python ‘s built-in types ทำให้คุณรู้สึกได้ว่าคุณคือ client

แน่นอนว่าเมื่อคุณ implement ตัว ADT คุณจะต้องเขียน client code เพื่อทดสอบมัน ในกรณีนี้คุณจะมี 2 บทบาทที่สามารถทำให้คุณสับสนได้ คุณจะต้องพยายามดูว่าคุณเล่นอยู่ในบทบาทใดในเวลาใดๆ

18.9 Glossary

abstract data type (ADT): A data type (usually a collection of objects) that is defined by a set of operations but that can be implemented in a variety of ways.

interface: The set of operations that define an ADT.

implementation: Code that satisfies the syntactic and semantic requirements of an interface.

client: A program (or the person who wrote it) that uses an ADT.

provider: The code (or the person who wrote it) that implements an ADT.

veneer: A class definition that implements an ADT with method definitions that are invocations of other methods, sometimes with simple transformations. The veneer does no significant work, but it improves or standardizes the interface seen by the client.

generic data structure: A kind of data structure that can contain data of any type.

infix: A way of writing mathematical expressions with the operators between the operands.

postfix: A way of writing mathematical expressions with the operators after the operands.

parse: To read a string of characters or tokens and analyze its grammatical structure.

token: A set of characters that are treated as a unit for purposes of parsing, such as the words in a natural language.