accy570lab_week11

School

University of Illinois, Urbana Champaign

Course

570

Subject

Computer Science

Date

Dec 6, 2023

Uploaded by HighnessLoris3604

Week 11 Lab

Python Regular Expression

A regex (regular expression) is a sequence of characters that defines a search pattern. It is used mainly for search and find-and-replace operations in search engines and text processors.

In [1]: import re

Raw Strings

Patterns passed to functions in Python's re module are usually written as raw strings. A normal string literal prefixed with 'r' or 'R' becomes a raw string. The difference is that Python interprets escape sequences (such as \n, \t) in a normal string, while in a raw string the backslash is kept as a literal character.

In [2]: print('Hello\nWorld')
Hello
World

In [3]: print(r'Hello\nWorld')
Hello\nWorld

In [4]: '\d'
Out[4]: '\\d'

Because \d is not a recognized escape sequence, Python keeps the backslash, and the result is displayed with the backslash doubled. Recent Python versions warn about such invalid escapes, which is one more reason to write patterns as raw strings.

Meta Characters

Some characters carry a special meaning when they appear as part of a pattern string:

. ^ $ * + ? [ ] \ | ( ) { }

Pattern   Description
\d        Matches any decimal digit; equivalent to the class [0-9]
\D        Matches any non-digit character
\s        Matches any whitespace character
\S        Matches any non-whitespace character
\w        Matches any alphanumeric character or underscore
\W        Matches any character that \w does not match
\b        Matches a word boundary: the position between a \w character and a non-\w character (or the start or end of the string)
.         Matches any single character except newline '\n'
?         Matches 0 or 1 occurrence of the pattern to its left
+         Matches 1 or more occurrences of the pattern to its left
*         Matches 0 or more occurrences of the pattern to its left
[...]     Matches any single character listed in the square brackets
{n,m}     Matches at least n and at most m occurrences of the preceding pattern
a|b       Matches a or b

re.findall(pattern, string)

Searches the string for all non-overlapping occurrences of the pattern and returns them as a list.

In [5]: s = "We watched Star Wars on Wednesday night at 8:30."

In [6]: re.findall(r'.', s)
Out[6]: ['W', 'e', ' ', 'w', 'a', 't', 'c', 'h', 'e', 'd', ' ', 'S', 't', 'a', 'r', ' ',
 'W', 'a', 'r', 's', ' ', 'o', 'n', ' ', 'W', 'e', 'd', 'n', 'e', 's', 'd',
 'a', 'y', ' ', 'n', 'i', 'g', 'h', 't', ' ', 'a', 't', ' ', '8', ':', '3', '0', '.']

In [7]: re.findall(r'.*', s)
Out[7]: ['We watched Star Wars on Wednesday night at 8:30.', '']

In [8]: re.findall(r'.+', s)
Out[8]: ['We watched Star Wars on Wednesday night at 8:30.']

In [9]: re.findall(r'\w+', s)
Out[9]: ['We', 'watched', 'Star', 'Wars', 'on', 'Wednesday', 'night', 'at', '8', '30']

In [10]: re.findall(r'[a-zA-Z0-9]+', s)
Out[10]: ['We', 'watched', 'Star', 'Wars', 'on', 'Wednesday', 'night', 'at', '8', '30']

In [11]: re.findall(r'\d+', s)
Out[11]: ['8', '30']

In [12]: re.findall(r'[0-9]+', s)
Out[12]: ['8', '30']

In [13]: re.findall(r'[Ww]\w+', s)
Out[13]: ['We', 'watched', 'Wars', 'Wednesday']

Exercise 1

Find all words that start with a capital letter in s.

In [9]: s
Out[9]: 'We watched Star Wars on Wednesday night at 8:30.'
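Two details in the findall outputs above are easy to miss. First, '.*' returns an extra empty string because it is allowed to match zero characters, which it does once more at the end of s. Second, \w+ and [a-zA-Z0-9]+ give the same result on s only because s contains no underscore. The sketch below checks both points; the string t is a made-up example and is not part of the lab.

```python
import re

s = "We watched Star Wars on Wednesday night at 8:30."

# '.*' may match zero characters, so after consuming the whole string
# it matches one more time, emptily, at the very end.
star = re.findall(r'.*', s)
# '.+' needs at least one character, so there is no trailing empty match.
plus = re.findall(r'.+', s)

# Hypothetical string (not from the lab) that contains an underscore.
t = "file_name v2"
word_class = re.findall(r'\w+', t)               # \w includes '_'
explicit_class = re.findall(r'[a-zA-Z0-9]+', t)  # this class does not

print(star)
print(plus)
print(word_class)      # ['file_name', 'v2']
print(explicit_class)  # ['file', 'name', 'v2']
```

If underscores should count as part of a word, \w is the shorter choice; if they should split words, the explicit class is the safer one.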
In [12]: re.findall(r'[A-Z]\w+', s)
Out[12]: ['We', 'Star', 'Wars', 'Wednesday']

In [15]: s1='I watched Star Wars on Wednesday night at 8:30.'
         re.findall(r'[A-Z]\w*', s1)
Out[15]: ['I', 'Star', 'Wars', 'Wednesday']

In [11]: re.findall(r'[S]\w+', s)
Out[11]: ['Star']

Exercise 2

Find the time in s.

In [19]: re.findall(r'\d{1,2}:\d{1,2}', s)
Out[19]: ['8:30']

In [20]: re.findall(r'\d:\d+', s)
Out[20]: ['8:30']

Exercise 3

Find all emails in the string myprofile.

In [25]: myprofile = '''My school email is zllu2@illinois.edu and my personal email is lindenlu@gmail.com. My school phone number is 300-5970 and my cell phone is 417-2459. My zip code is 61820-0001'''
         print(myprofile)
My school email is zllu2@illinois.edu and my personal email is lindenlu@gmail.com. My school phone number is 300-5970 and my cell phone is 417-2459. My zip code is 61820-0001

In [26]: re.findall(r'\w+@\w+[.]\w+', myprofile)
Out[26]: ['zllu2@illinois.edu', 'lindenlu@gmail.com']

Limited Repetition and Boundary

{m} {m,n} \b

In [27]: s
Out[27]: 'We watched Star Wars on Wednesday night at 8:30.'

In [28]: re.findall(r'\w{2}', s)
Out[28]: ['We', 'wa', 'tc', 'he', 'St', 'ar', 'Wa', 'rs', 'on', 'We', 'dn', 'es', 'da', 'ni', 'gh', 'at', '30']

In [29]: s
Out[29]: 'We watched Star Wars on Wednesday night at 8:30.'

In [30]: re.findall(r'\w{2}\b', s)
Out[30]: ['We', 'ed', 'ar', 'rs', 'on', 'ay', 'ht', 'at', '30']

In [20]: re.findall(r'\b\w{2}\b', s)
Out[20]: ['We', 'on', 'at', '30']

In [21]: re.findall(r'\b\w{1,4}\b', s)
Out[21]: ['We', 'Star', 'Wars', 'on', 'at', '8', '30']

Exercise 4

Find all phone numbers in myprofile.

In [31]: print(myprofile)
My school email is zllu2@illinois.edu and my personal email is lindenlu@gmail.com. My school phone number is 300-5970 and my cell phone is 417-2459. My zip code is 61820-0001

In [37]: re.findall(r'\b\d{3}-\d{4}\b', myprofile)
Out[37]: ['300-5970', '417-2459']
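The \b anchors in the Exercise 4 pattern do real work. Without them, the same pattern also picks up the tail of the zip code. The sketch below compares the two versions. It assumes the myprofile text sits on a single line (the original line breaks are not known), and the zip-code pattern at the end is an extra illustration that is not part of the lab.

```python
import re

# Same text as the lab's myprofile string, assumed here to be on one line.
myprofile = ("My school email is zllu2@illinois.edu and my personal email is "
             "lindenlu@gmail.com. My school phone number is 300-5970 and my "
             "cell phone is 417-2459. My zip code is 61820-0001")

# No boundaries: the pattern can start in the middle of '61820-0001'.
without_boundary = re.findall(r'\d{3}-\d{4}', myprofile)

# With \b on both sides, the match must start and end at a word boundary,
# so the three digits cannot be the tail of a longer run of digits.
with_boundary = re.findall(r'\b\d{3}-\d{4}\b', myprofile)

# Extra illustration (not from the lab): a ZIP+4 code by the same idea.
zip_codes = re.findall(r'\b\d{5}-\d{4}\b', myprofile)

print(without_boundary)  # ['300-5970', '417-2459', '820-0001']
print(with_boundary)     # ['300-5970', '417-2459']
print(zip_codes)         # ['61820-0001']
```

The same reasoning explains the difference between Out[28] and Out[20] above: \w{2} alone matches pairs of characters anywhere inside longer words, while \b\w{2}\b matches only whole two-character words.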