feat: detect trojan source attack #95

simone-sanfratello · 2022-11-09T09:19:44Z

This PR aims to add trojan source attack detection to the security plugin, using the package originally authored by @lirantal: https://github.com/lirantal/anti-trojan-source/

Co-authored-by: Luciano Mammino <lucianomammino@gmail.com>

lmammino · 2022-11-15T10:29:17Z

Hello, any update on this one?

cc: @nlf

nzakas · 2022-11-16T18:34:04Z

Sorry, just swamped right now. I need to dig back through the issue in the ESLint repo to see how this compares to what we discussed here: eslint/eslint#15240 (comment)

nzakas

Thanks for putting this together. I left a bunch of comments.

I also noticed that this will only catch bidi characters in tokens or comments, but they can also occur in the whitespace of a file, and this rule won't catch those. Can you update the rule to catch those?

nzakas · 2022-11-09T18:26:19Z

rules/detect-trojan-source.js

+// Requirements
+//-----------------------------------------------------------------------------
+
+const { hasTrojanSource } = require('anti-trojan-source');


It seems like all this package does is search for particular character codes defined here:
https://github.com/lirantal/anti-trojan-source/blob/main/src/constants.js

Can we just copy those codes over and avoid an extra dependency (which also includes a CLI)?
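
A minimal sketch of what inlining the character codes could look like in place of the `anti-trojan-source` dependency. The list below is the set of Unicode bidirectional control characters as commonly cited; whether it matches the linked constants file exactly is an assumption, and `BIDI_CHARS` and `hasBidiCharacters` are hypothetical names.

```javascript
// Unicode bidirectional control characters (assumed list; verify it against
// the constants file linked above before relying on it).
const BIDI_CHARS = [
  '\u061C', // ARABIC LETTER MARK
  '\u200E', // LEFT-TO-RIGHT MARK
  '\u200F', // RIGHT-TO-LEFT MARK
  '\u202A', // LEFT-TO-RIGHT EMBEDDING
  '\u202B', // RIGHT-TO-LEFT EMBEDDING
  '\u202C', // POP DIRECTIONAL FORMATTING
  '\u202D', // LEFT-TO-RIGHT OVERRIDE
  '\u202E', // RIGHT-TO-LEFT OVERRIDE
  '\u2066', // LEFT-TO-RIGHT ISOLATE
  '\u2067', // RIGHT-TO-LEFT ISOLATE
  '\u2068', // FIRST STRONG ISOLATE
  '\u2069', // POP DIRECTIONAL ISOLATE
];

// Hypothetical stand-in for hasTrojanSource: returns true if any of the
// listed characters occurs anywhere in the given text.
function hasBidiCharacters(sourceText) {
  return BIDI_CHARS.some((ch) => sourceText.includes(ch));
}
```

With the list inlined, the rule would need no runtime dependency and no CLI would be pulled in.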

@nzakas WDYT about adding this package to devDependencies, so we can keep track of changes more easily (we'll see them when reviewing new versions) and update our code with new characters if they add any?

Just for context, we might have to update that particular function based on the outcome of the outstanding conversation point below

@MichaelDeBoey that doesn't seem necessary. Bidi characters already exist and are known, so I'm not sure why we'd need to keep track of updates anywhere else.

docs/anti-trojan-source.md

rules/detect-trojan-source.js

nzakas · 2022-11-24T19:00:30Z

rules/detect-trojan-source.js

+          node.tokens.forEach((tokenObject) => {
+            if (tokenObject.value && hasTrojanSource({ sourceText: tokenObject.value })) {
+              context.report({
+                node: node,


In order to correctly report the location of the character, we need to manually pass in the loc, otherwise every report will have a location of line 1, column 0. See https://eslint.org/docs/latest/developer-guide/working-with-rules#contextreport.

Here's an example:
https://github.com/eslint/eslint/blob/main/lib/rules/no-tabs.js#L59-L72

You should also add checks to the tests to verify that the location is correct.

Because these characters are invisible, having the correct location is really important.
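
A small sketch of the idea, assuming the rule already knows the character's offset within the source text. `offsetToLoc` and `buildReport` are hypothetical helpers, the message text is a placeholder, and the line splitting only handles `\n` and `\r\n`. In a real rule, ESLint's `sourceCode.getLocFromIndex()` performs this conversion.

```javascript
// Convert a character offset within `text` into an ESLint-style position:
// 1-based line, 0-based column. Simplification: only \n and \r\n are
// treated as line breaks here.
function offsetToLoc(text, offset) {
  const linesBefore = text.slice(0, offset).split(/\r?\n/);
  return {
    line: linesBefore.length,
    column: linesBefore[linesBefore.length - 1].length,
  };
}

// Build a report descriptor with an explicit `loc`, so the report points at
// the invisible character rather than at the start of the node.
function buildReport(node, text, offset) {
  const start = offsetToLoc(text, offset);
  return {
    node,
    loc: {
      start,
      end: { line: start.line, column: start.column + 1 },
    },
    message: 'Detected a bidi character.', // placeholder message
  };
}
```

Inside a rule, the descriptor would be handed to `context.report(...)`, and the tests could then assert on `line` and `column` for each expected error.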

We spent a bit of time working on this one and we realised that the original detect-trojan-source by @lirantal does not give you granularity on where exactly every single malicious unicode character is. Instead, it just gives you a boolean telling you whether the given code token or comment block contains at least one such character.

Because of this, the location reporting is a bit vague and it just gives users a reference about the location of the affected token. This can be useful, but also misleading (say for example there's a large code block spanning multiple lines).

We changed the code slightly to report the location of the token (and updated tests accordingly).

We could improve this further by changing the hasTrojanSource function to return a list of malicious unicode characters and their offset in the token.

This would make the scan a bit more expensive though, so we are not too sure whether it is a worthwhile approach or not.

There's a third option that might be a decent tradeoff between accuracy and performance. Rather than returning a boolean, hasTrojanSource could return the offset of the first occurrence of a malicious unicode character. This way, we will still have one single error per token, but we could at least point the user to the first occurrence of the hidden character. Once the user fixes that, they would be prompted about the next character (if there's more than one per token).

@nzakas what do you think? Do you have any preference?
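
For illustration, a rough sketch of that third option. `firstBidiOffset` and `BIDI_PATTERN` are hypothetical names, and the character class is an assumed list of bidi control characters rather than a verified copy of the package's constants.

```javascript
// Assumed set of bidi control characters, expressed as a character class.
const BIDI_PATTERN = /[\u061C\u200E\u200F\u202A-\u202E\u2066-\u2069]/;

// Returns the offset of the first bidi character in `text`, or -1 if the
// text contains none. String.prototype.search stops at the first match.
function firstBidiOffset(text) {
  return text.search(BIDI_PATTERN);
}
```

The caller would treat `-1` as "clean" and otherwise report a single error at the returned offset within the token.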

I like that third option which is sorta fail fast as a good balance for quickly running through the code, if it's costly. Also, would it make any sense to push these updates to the detect-trojan-source package, or is that at all not used here anyway?

Happy to do a PR with any relevant change to detect-trojan-source back to the original repo (as of now we did not change any code coming from there, just copied verbatim the hasTrojanSource and pasted it into the codebase here).

Thanks a lot for the suggestion of option 3. For now my favourite too :)

@lmammino I think you may be overthinking this a bit. In general, people don't like rules that gradually reveal errors -- they want to see them all up front. If it were me, I would expect warnings about every instance of a bidi character in the source code so I'd know how many issues we are talking about.

A good example to follow is the ESLint core rule no-tabs, which does essentially the same thing, only it's looking for tab characters instead of bidi characters.
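
A hedged sketch of what a no-tabs-style rule could look like, reporting every occurrence on every line. This is not the code that was merged; the character class is an assumed list of bidi control characters, and the `bidi` message id and its text are placeholders.

```javascript
// Assumed set of bidi control characters; the `g` flag is required by matchAll.
const BIDI_PATTERN = /[\u061C\u200E\u200F\u202A-\u202E\u2066-\u2069]/g;

// Rule object in the usual ESLint shape; a plugin would export it.
const rule = {
  meta: {
    type: 'problem',
    schema: [],
    messages: { bidi: 'Unexpected bidi character U+{{code}}.' }, // placeholder
  },
  create(context) {
    // `context.sourceCode` in newer ESLint, `getSourceCode()` in older versions.
    const sourceCode = context.sourceCode ?? context.getSourceCode();

    return {
      Program(node) {
        // Walk the raw lines so that code, comments, and whitespace are all covered.
        sourceCode.lines.forEach((line, index) => {
          for (const match of line.matchAll(BIDI_PATTERN)) {
            const code = match[0].codePointAt(0).toString(16).toUpperCase();
            context.report({
              node,
              loc: {
                start: { line: index + 1, column: match.index },
                end: { line: index + 1, column: match.index + 1 },
              },
              messageId: 'bidi',
              data: { code: code.padStart(4, '0') },
            });
          }
        });
      },
    };
  },
};
```

Scanning whole lines rather than individual tokens yields one report per character, each with its own line and column, so users see the full count up front.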

Solved using a similar approach to no-tabs

rules/detect-trojan-source.js

nzakas · 2022-12-06T20:56:13Z

Just checking back to see if you intend to continue working on this?

simone-sanfratello · 2022-12-07T13:54:33Z

Just checking back to see if you intend to continue working on this?

Yes, we are working on that - in our spare time

lmammino · 2022-12-07T15:34:39Z

Just checking back to see if you intend to continue working on this?

Yes, thanks a lot for all the suggestions. We have allocated some time this week to go through them, so hopefully, we'll have an updated PR by the end of the week.

lmammino · 2022-12-09T09:41:50Z

Hey @nzakas we took some time today to apply all the changes you suggested.

First of all, thank you very much for taking the time to do such an in-depth review. I certainly learned a lot thanks to it.

Secondly, we believe this is in a good state now. There's only a single discussion point that might be worth reconsidering and it's related to reporting the unicode character location.

Let us know what you think about that one so we can move this PR forward.

Thanks again

nzakas · 2022-12-13T01:42:41Z

Thanks. I’ll take a look sometime this week.

nzakas

Thanks again for your work on this. I left a note inline, but also pulling out here: I think it would be better to model this rule after no-tabs in the ESLint core. It is basically the same algorithm only it's looking for tab characters instead of bidi characters.

At a higher level, I think a better name for the rule is detect-bidi-characters, because there may be a legitimate reason to use these characters that isn't necessarily a trojan source attack. I also think "trojan source" is a bit of an opaque term that requires reading all of the documentation to understand vs. "bidi characters", which is instantly recognizable.

rules/detect-trojan-source.js

simone-sanfratello · 2022-12-16T11:06:25Z

Thanks again for your work on this. I left a note inline, but also pulling out here: I think it would be better to model this rule after no-tabs in the ESLint core. It is basically the same algorithm only it's looking for tab characters instead of bidi characters.

At a higher level, I think a better name for the rule is detect-bidi-characters, because there may be a legitimate reason to use these characters that isn't necessarily a trojan source attack. I also think "trojan source" is a bit of an opaque term that requires reading all of the documentation to understand vs. "bidi characters", which is instantly recognizable.

Thanks @nzakas for the feedback, we've implemented it that way.
Please take a look at the tests; they show what we want to achieve.

We're going to complete the PR with the other changes (name, jsdoc) once we agree on the solution.

lmammino · 2022-12-16T17:34:26Z

Hello @nzakas, we believe we applied all the requested changes, including renaming the rule to detect-bidi-characters. Please let us know if you think this is good to go or if it needs more work.

Thanks for all the awesome suggestions so far!

nzakas

This looks great now. Thanks for being so open to feedback.

I just wanted to double check that you don’t have any other cleanup work you wanted to do before merging?

nzakas · 2022-12-17T01:59:26Z

Please double check the lint errors.

lmammino · 2022-12-17T12:04:19Z

Thanks @nzakas, we believe this is good to go.

PS: just fixed the markdown linting issues as well

nzakas · 2022-12-30T17:38:01Z

Sorry, was away for the holidays. The linting CI is still broken. Can you take a look?

lirantal

Added suggestions to comply with CI linting issues

docs/rules/detect-bidi-characters.md

Co-authored-by: Liran Tal <liran.tal@gmail.com>

nzakas

All right, this is good to go. Thanks so much for your hard work on this.

lmammino · 2023-01-03T08:02:00Z

Thank you @nzakas, @simone-sanfratello and @lirantal 😊

simone-sanfratello marked this pull request as ready for review November 9, 2022 09:22

nzakas requested changes Nov 24, 2022

nzakas requested changes Dec 14, 2022

lmammino and others added 7 commits December 16, 2022 17:40

Resolved conflicts in README

83bc5eb

Embedded anti-trojan-source and removed from dependencies

7467ad8

Expanded README with an example

609ac58

Changed onCodePath with Program

b29b16a

Improved code style as suggested in review

f9b2efe

Added rough location estimation in the error report

ea58be4

feat: implement exact location for each bidi char

c621341

lmammino force-pushed the feat/anti-trojan-charset branch from c0c546e to c621341 December 16, 2022 16:41

lmammino added 2 commits December 16, 2022 18:26

Renamed to detect-bidi-characters and fixed first line offset in comments

53d3456

Added JSDoc in detectBidiCharacters

f398b8f

simone-sanfratello requested a review from nzakas December 16, 2022 17:32

nzakas approved these changes Dec 17, 2022

Fixed MD034 - Bare URL used

e8e145e

lirantal reviewed Dec 30, 2022

Apply suggestions from code review

7d3f843

Co-authored-by: Liran Tal <liran.tal@gmail.com>

nzakas approved these changes Jan 2, 2023

nzakas merged commit 4294d29 into eslint-community:main Jan 2, 2023

lmammino deleted the feat/anti-trojan-charset branch January 3, 2023 08:01

ota-meshi mentioned this pull request Jan 12, 2023

Including "unicode bidi attacks" defense #72

Closed
