Update to Mozilla's new V8 regexp parser #1675

Open
opened 2 years ago by Moonchild · 10 comments
Owner

Bounty material.

We are using and old and heavily modified version of irregexp at the moment (which has had new features added like dotAll), however the implementation is difficult to maintain and expand due to subtle issues in object handling as evidenced by the hard to trace issue in my lookbehind implementation.

Mozilla ran into the same and chose to import a new version of irregexp from Google Chrome and using it as-is with plumbing built around it to make SpiderMonkey being able to deal with the code.
We should do the same, considering manually implementing the irregexp code updates in our version of irregexp will likely need an updated version anyway, and rewriting that to be similar to what we have now would likely be even more work.

This is a somewhat large code porting project but should be fairly straightforward.

This is fairly self-contained to this meta bug and what is listed as dependent tasks in that meta.


BMO Reference (Meta): Bug 1367105

**Bounty material.** We are using and old and heavily modified version of irregexp at the moment (which has had new features added like dotAll), however the implementation is difficult to maintain and expand due to subtle issues in object handling as evidenced by the hard to trace issue in my lookbehind implementation. Mozilla ran into the same and chose to import a new version of irregexp from Google Chrome and using it as-is with plumbing built around it to make SpiderMonkey being able to deal with the code. We should do the same, considering manually implementing the irregexp code updates in our version of irregexp will likely need an updated version anyway, and rewriting that to be similar to what we have now would likely be even more work. This is a somewhat large code porting project but should be fairly straightforward. This is fairly self-contained to this meta bug and what is listed as dependent tasks in that meta. ---- BMO Reference (Meta): [Bug 1367105](https://bugzilla.mozilla.org/show_bug.cgi?id=1367105)
Moonchild added the
Bounty
Javascript
Enhancement
Web Compatibility
labels 2 years ago

Since we are going to vaugely follow Mozilla progression this issue is meta. As such, all dependancies are potental bounty materal. I suppose that means the more you do the more you get. Just be sure to use the Issue Tracker depenancy system since that is a thing now.

I am thinking this is going to be a coorperative effort as it will require a general swath of skillsets and experience. As such I am going to be coordinating and directing. Overall progress can be discussed here but dependant issues and discussions regarding them should be there.

Since I am the build system peer I will do inital import and add build system bits. Others who may want and would be the first choice for some parts would be @g4jc and potentally @athenian200

Since we are going to vaugely follow Mozilla progression this issue is meta. As such, all dependancies are potental bounty materal. I suppose that means the more you do the more you get. Just be sure to use the Issue Tracker depenancy system since that is a thing now. I am thinking this is going to be a coorperative effort as it will require a general swath of skillsets and experience. As such I am going to be coordinating and directing. Overall progress can be discussed here but dependant issues and discussions regarding them should be there. Since I am the build system peer I will do inital import and add build system bits. Others who may want and would be the first choice for some parts would be @g4jc and potentally @athenian200
Ghost changed title from Import and use V8's updated regexp parser to Update to Mozilla's updated (and modified) V8 regexp parser 2 years ago
Ghost changed title from Update to Mozilla's updated (and modified) V8 regexp parser to Update to Mozilla's updated V8 regexp parser 2 years ago
Ghost added a new dependency 2 years ago
Ghost removed a dependency 2 years ago
Ghost changed title from Update to Mozilla's updated V8 regexp parser to Update to Mozilla's new V8 regexp parser 2 years ago

So in Issue #1677 the files and build config was imported. It does not build. My best guess is at LEAST Bug 1592307 will need done before it will compile.

So in Issue #1677 the files and build config was imported. It does not build. My best guess is at LEAST [Bug 1592307](https://bugzilla.mozilla.org/show_bug.cgi?id=1592307) will need done before it will compile.

Here is a list of the bugs and their landing order.

# 1592302 - 2019-12-3
# 1592307 - 2019-12-12
# 1620020 - First four 2020-03-13 / Last Eleven 2020-03-14
# 1624015 - 2020-03-24
# 1594545 - First four 2020-03-28 / Next four 2020-03-30 / 96331c813c4b on 2020-03-31 / Last four on 2020-04-01

# 1626713 - 2020-04-02 -- First bug that hooks it up

# 1627838 - 2020-04-09
# 1628835 - 2020-04-16 ! Check exact landing order
# 1629670 - 2020-04-16 ! Check exact landing order
# 1630090 - 2020-04-17
# 1630383 - 2020-04-20 ! Check exact landing order
# 1631504 - 2020-04-20 ! Check exact landing order
# 1635275 - 2020-05-12
# 1634810 - 2020-05-15
Here is a list of the bugs and their landing order. ``` # 1592302 - 2019-12-3 # 1592307 - 2019-12-12 # 1620020 - First four 2020-03-13 / Last Eleven 2020-03-14 # 1624015 - 2020-03-24 # 1594545 - First four 2020-03-28 / Next four 2020-03-30 / 96331c813c4b on 2020-03-31 / Last four on 2020-04-01 # 1626713 - 2020-04-02 -- First bug that hooks it up # 1627838 - 2020-04-09 # 1628835 - 2020-04-16 ! Check exact landing order # 1629670 - 2020-04-16 ! Check exact landing order # 1630090 - 2020-04-17 # 1630383 - 2020-04-20 ! Check exact landing order # 1631504 - 2020-04-20 ! Check exact landing order # 1635275 - 2020-05-12 # 1634810 - 2020-05-15 ```

As noted in the recent issues the plan changed somewhat but we are in the phase of making it build before we start hooking it up to the js engine.

Do catch up people.

As noted in the recent issues the plan changed somewhat but we are in the phase of making it build before we start hooking it up to the js engine. Do catch up people.
Poster
Owner

I'm having another earnest stab at this but i'm running into one particular error I can't seem to fix because i don't even exactly understand how this plumbing works.

 0:33.57 c:\mozdev\PaleMoon\platform\js\src\regexp/regexp-parser.h(330): error C2784: 'bool operator <(const void *,const SharedMem<T> &)': could not deduce template argument for 'const SharedMem<T> &' from 'const v8::internal::ZoneVector<v8::uc16>'
 0:33.57 c:\mozdev\palemoon\platform\js\src\vm/SharedMem.h(220): note: see declaration of 'operator <'
 0:33.57 c:\mozdev\PaleMoon\platform\js\src\regexp/regexp-parser.h(330): error C2676: binary '<': 'const v8::internal::ZoneVector<v8::uc16>' does not define this operator or a conversion to a type acceptable to the predefined operator
I'm having another earnest stab at this but i'm running into one particular error I can't seem to fix because i don't even exactly understand how this plumbing works. ``` 0:33.57 c:\mozdev\PaleMoon\platform\js\src\regexp/regexp-parser.h(330): error C2784: 'bool operator <(const void *,const SharedMem<T> &)': could not deduce template argument for 'const SharedMem<T> &' from 'const v8::internal::ZoneVector<v8::uc16>' 0:33.57 c:\mozdev\palemoon\platform\js\src\vm/SharedMem.h(220): note: see declaration of 'operator <' 0:33.57 c:\mozdev\PaleMoon\platform\js\src\regexp/regexp-parser.h(330): error C2676: binary '<': 'const v8::internal::ZoneVector<v8::uc16>' does not define this operator or a conversion to a type acceptable to the predefined operator ```
Poster
Owner

Well whaddayaknow. it does seem like 738448aa0d actually solved that particular build error after all. I thought initially it didn't, which caused my confusion. Clobbering to verify.
Never mind, it was just bumped out of view by another error.

~~Well whaddayaknow. it does seem like 738448aa0d23fae91dbb5ad06cdeea1a96038c5e actually solved that particular build error after all. I thought initially it didn't, which caused my confusion. Clobbering to verify.~~ Never mind, it was just bumped out of view by another error.
Poster
Owner

ignore my past 2 comments here I've already run into that issue in #1679 and am no closer to solving it.
I am working on solving other surrounding stuff though as much as I can without a completed build.

ignore my past 2 comments here I've already run into that issue in #1679 and am no closer to solving it. I am working on solving other surrounding stuff though as much as I can without a completed build.
Poster
Owner

Build issues were solved in #1679 (now closed) so we can continue here.

Build issues were solved in #1679 (now closed) so we can continue here.
Poster
Owner

We'll be skipping all BinJS/Binary AST hookups because we do not have (nor intend to have!) a Binary AST parser in our engine.

We'll be skipping all BinJS/Binary AST hookups because we do not have (nor intend to have!) a Binary AST parser in our engine.
Moonchild self-assigned this 1 year ago
Poster
Owner

Split the list from #1800 out into two parts, the hooking up part left in #1800 and the rest pasted here for now.

  • [IDB-Refact] Bug 1623278 - Prototype reduction of raw pointer usage
    • [IDB-Refact] Bug 1633719 - Add support for NotNull<SafeRefPtr> and use NotNull with InitializedOnce
  • [Multi-backout-reland] Bug 1634135 - Enable new regexp engine
  • Bug 1635275 - Add target for differential fuzzing
  • Bug 1637199 - Build failure due to #include picking up js/src/new-regexp/VERSION on macOS (unknown type name 'Imported', expected unqualified-id)
  • Bug 1637203 - Perma js/src/new-regexp/regexp-shim.cc:59:3: error: 'SprintfLiteral' was not declared in this scope when Gecko 78 merges to Beta on 2020-06-01
  • Bug 1636495 - Fix error handling in nsContentUtils::IsPatternMatching
    • Seems to be a pre-req for re-landing 1634135
  • Bug 1637189 - Crash [@ ??] through [@ js::RegExpShared::execute] or Assertion failure: start + length <= baseArg->length(), at vm/StringType-inl.h:207
  • !-- Reland 1634135 for the final time
  • Bug 1637977 - ThreadSanitizer: data race [@ emplace<>] vs. [@ isNothing] with memory corruption
  • [Bug 1637913 - Crash [@ v8::internal::ActionNode::StorePosition] or Assertion failure: (RegExpMacroAssembler::kMaxRegister) >= (next_register_](https://bugzilla.mozilla.org/show_bug.cgi?id=1637913 - Crash [@ v8::internal::ActionNode::StorePosition] or Assertion failure: (RegExpMacroAssembler::kMaxRegister) >= (next_register_) - 1), at new-regexp/regexp-compiler.cc:236
  • Bug 1362154 - Implement RegExp named groups
  • Bug 1638154 - Assertion failure: is_uint24(cp_offset + eats_at_least), at new-regexp/regexp-bytecode-generator.cc:183
  • Bug 1640479 - Named capture groups left undefined after OOM in RegExpShared::initializeNamedCaptures
  • [Reimport] Bug 1641352 - GC-free regexp parsing
  • [Replace] Bug 1642493 - Tidy up loose ends
  • Bug 1635275 - Irregexp: Add target for differential fuzzing - landed 2020-05-12
  • Bug 1634810 - Update test262 blacklist for new RegExp features - landed 2020-05-15

The bottom two (Bug 1635275, Bug 1634810) are likely uninteresting for us and can probably be skipped.

Split the list from #1800 out into two parts, the hooking up part left in #1800 and the rest pasted here for now. - [ ] [IDB-Refact] [Bug 1623278](https://bugzilla.mozilla.org/show_bug.cgi?id=1623278) - Prototype reduction of raw pointer usage - [ ] [IDB-Refact] [Bug 1633719](https://bugzilla.mozilla.org/show_bug.cgi?id=1633719) - Add support for NotNull<SafeRefPtr<T>> and use NotNull with InitializedOnce - [ ] [Multi-backout-reland] [Bug 1634135](https://bugzilla.mozilla.org/show_bug.cgi?id=1634135) - Enable new regexp engine - [ ] [Bug 1635275](https://bugzilla.mozilla.org/show_bug.cgi?id=1635275) - Add target for differential fuzzing - [ ] [Bug 1637199](https://bugzilla.mozilla.org/show_bug.cgi?id=1637199) - Build failure due to #include <version> picking up js/src/new-regexp/VERSION on macOS (unknown type name 'Imported', expected unqualified-id) - [ ] [Bug 1637203](https://bugzilla.mozilla.org/show_bug.cgi?id=1637203) - Perma js/src/new-regexp/regexp-shim.cc:59:3: error: 'SprintfLiteral' was not declared in this scope when Gecko 78 merges to Beta on 2020-06-01 - [ ] [Bug 1636495](https://bugzilla.mozilla.org/show_bug.cgi?id=1636495) - Fix error handling in nsContentUtils::IsPatternMatching - Seems to be a pre-req for re-landing 1634135 - [ ] [Bug 1637189](https://bugzilla.mozilla.org/show_bug.cgi?id=1637189) - Crash [@ ??] through [@ js::RegExpShared::execute] or Assertion failure: start + length <= baseArg->length(), at vm/StringType-inl.h:207 - [ ] !-- Reland 1634135 for the final time - [ ] [Bug 1637977](https://bugzilla.mozilla.org/show_bug.cgi?id=1637977) - ThreadSanitizer: data race [@ emplace<>] vs. [@ isNothing] with memory corruption - [ ] [Bug 1637913 - Crash [@ v8::internal::ActionNode::StorePosition] or Assertion failure: (RegExpMacroAssembler::kMaxRegister) >= (next_register_](https://bugzilla.mozilla.org/show_bug.cgi?id=1637913 - Crash [@ v8::internal::ActionNode::StorePosition] or Assertion failure: (RegExpMacroAssembler::kMaxRegister) >= (next_register_) - 1), at new-regexp/regexp-compiler.cc:236 - [ ] [Bug 1362154](https://bugzilla.mozilla.org/show_bug.cgi?id=1362154) - Implement RegExp named groups - [ ] [Bug 1638154](https://bugzilla.mozilla.org/show_bug.cgi?id=1638154) - Assertion failure: is_uint24(cp_offset + eats_at_least), at new-regexp/regexp-bytecode-generator.cc:183 - [ ] [Bug 1640479](https://bugzilla.mozilla.org/show_bug.cgi?id=1640479) - Named capture groups left undefined after OOM in RegExpShared::initializeNamedCaptures - [ ] [Reimport] [Bug 1641352](https://bugzilla.mozilla.org/show_bug.cgi?id=1641352) - GC-free regexp parsing - [ ] [Replace] [Bug 1642493](https://bugzilla.mozilla.org/show_bug.cgi?id=1642493) - Tidy up loose ends - [ ] Bug 1635275 - Irregexp: Add target for differential fuzzing - landed 2020-05-12 - [ ] Bug 1634810 - Update test262 blacklist for new RegExp features - landed 2020-05-15 The bottom two (Bug 1635275, Bug 1634810) are likely uninteresting for us and can probably be skipped.
Sign in to join this conversation.
No Milestone
No Assignees
2 Participants
Notifications
Due Date

No due date set.

Reference: MoonchildProductions/UXP#1675
Loading…
There is no content yet.