Hi Anthony,
Thank you for the wonderful book.
I'm working through Ch. 13 on regular expressions on Windows 10, using PostgreSQL 10 in pgAdmin 4.
I was having trouble getting the regular expressions to work, for example in code listing 13-7.
I believe the issue is related to the way newline characters are handled on Windows.
This also may be related to the following issues (I am using a clean version of the imported csv file from the crime data):
I was able to solve this issue with this SO answer: https://stackoverflow.com/a/20056634. Apparently, on Windows a line break may be stored as \r\n rather than as \n alone.
Here is my SQL code for the crime type and the output, where crime_type_orig is the original pattern from the book and crime_type2 and crime_type3 are based on the above SO answer:
select
regexp_match(original_text, '\n(?:\w+ \w+|\w+)\n(.*):') as crime_type_orig,
-- See https://stackoverflow.com/a/20056634
regexp_match(original_text, '\r\n(?:\w+ \w+|\w+)\r\n(.*):') as crime_type2,
-- Based on https://stackoverflow.com/a/20056634
regexp_match(original_text, '(?:\r\n|\r|\n)(?:\w+ \w+|\w+)(?:\r\n|\r|\n)(.*):') as crime_type3
from crime_reports;
Here is the output from pgAdmin:
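Another option that might be worth trying, instead of widening every pattern, is to normalize the line endings before matching so the book's original pattern can stay unchanged. The query below is only a sketch: the `sample` CTE and its single row are hypothetical stand-ins for `crime_reports`, and I'm assuming the default `standard_conforming_strings = on`, so the backslashes in the plain-quoted patterns reach the regex engine as written. With a `\r\n` row like this one, I'd expect the first column to come back NULL and the second to capture the crime type.

```sql
-- Hypothetical one-row sample standing in for crime_reports.original_text.
-- The E'' string embeds Windows-style \r\n line endings.
WITH sample (original_text) AS (
    VALUES (E'01/01/17\r\nSterling\r\nLarceny: A bicycle was reported stolen.')
)
SELECT
    -- Original pattern from the book, applied to the raw text.
    regexp_match(original_text, '\n(?:\w+ \w+|\w+)\n(.*):') AS crime_type_raw,
    -- Same pattern, applied after rewriting \r\n and lone \r to \n.
    regexp_match(
        regexp_replace(original_text, '\r\n?', E'\n', 'g'),
        '\n(?:\w+ \w+|\w+)\n(.*):'
    ) AS crime_type_normalized
FROM sample;
```

If this behaves the same way on the real table, the same `regexp_replace` call could presumably be run once in an UPDATE on `original_text`, though I have not tested that against the book's data.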
