-
Notifications
You must be signed in to change notification settings - Fork 603
Separate path and query in URI API #2290
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
This is the followup refactoring step for PR #2141 . |
rousskov
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a partial review. There are more problems to fix here, but enumerating all of them is too early for this premature PR as this code is likely to change a lot, including in prerequisite PRs.
| return false; | ||
|
|
||
| if (*src != '\r' && *src != '\n' && *src != '\0') | ||
| throw TextException("invalid URL", Here()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A parsing method that reports success/failure via its boolean return value should not throw on invalid input. The two result-reporting designs are mutually exclusive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
um, I don't think that is correct use of the term "mutually exclusive".
This method is in transition towards the state where parsing errors are thrown, and the boolean false means needs-more-data. The bits of logic where you do not see that existing are yet to be refactored away.
| } | ||
| *dst = '\0'; | ||
| chopped = StripAnyWsp(urlpath); | ||
| rfc1738_unescape(urlpath); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not add serious features to this bad (on several levels) Uri::parse() code until we refactor it to address its known quality problems. Two recent additions resulted in two serious bugs, and some of those recent changes still need to be polished to reach acceptable quality levels. We should not pile up more serious changes on top of this terrible foundation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You seems to have missed the fact that this PR is part of such a refactor.
Specifically that prior commit ff60b10 left the '?' query delimiter in the path() output (where it got wrongly encoded). After this PR we can review all the previously affected callers knowing that path and query are actually separate or re-combined properly depending on which method is used to access them. I expect to find that they are all correct after this change.
| ++i; // keep track of bytes 'consumed' from src | ||
| /* Then everything from '#' (exclusive) until '\r\n' or '\0' - that's fragment */ | ||
| for (; i < l && *src != '\r' && *src != '\n' && *src != '\0'; ++i, ++src) { | ||
| ; // fragment component is not expected in network protocols. discard for now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If official parse() code does not discard fragments, then this PR cannot use a "for now" excuse to start discarding them; most likely, this PR should not discard fragments at all, but we can discuss better excuses if really needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Official code calls it part of "path" component. Which is patently false.
This code splits the components correctly per RFC3986. Keeping the components which are used by ftp, and http(s) URLs.
If you insist on retaining fragment support, I can easily add it back in. Though it would be pointless to store, since it was only actually used by gopher: URLs.
| if (!path().isEmpty()) | ||
| absolutePath_ = Encode(path(), PathChars()); | ||
| if (!query().isEmpty()) { | ||
| absolutePath_.append("?"); | ||
| absolutePath_.append(Encode(query(), QueryChars())); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: This TODO implementation is the fix for bug 5523 titular issue for PR #2289. The rest of this PR is logic supporting this change.
No description provided.