I have been studying CodeQL recently. I won't introduce CodeQL itself here, since plenty of introductions can already be found online. This series is my personal CodeQL study notes, revised and organized from my own knowledge base and shared so that we can learn together. Personally, I find QL's syntax rather counterintuitive; compared with today's mainstream OOP languages, it has a real learning curve. Unlike most of the so-called CodeQL tutorials on the web, this series is based on the official documentation and its example scenarios. It contains a lot of my own understanding, reflection, and extension, gets straight to the point with almost no filler, and sticks to summarizing while working through each example and then verifying the summary against the examples. I hope it offers some different insights and ideas. It is also bound to contain errors, and I hope readers will point them out in the comments section.
Let's start by looking at some basic concepts and structures
// Structure
from /* variable declarations */
where /* logical formulas */
select /* expressions */
// A minimal example
from int x, int y
where x = 3 and y = 4
select x, y
// Find the Pythagorean triples among the integers 1 to 10
from int x, int y, int z
where
  x in [1 .. 10] and
  y in [1 .. 10] and
  z in [1 .. 10] and
  x * x + y * y = z * z
select x, y, z
// Or write it as follows, using a class for encapsulation and predicate reuse
class SmallInt extends int {
  SmallInt() { this in [1 .. 10] }

  int square() { result = this * this }
}

from SmallInt x, SmallInt y, SmallInt z
where x.square() + y.square() = z.square()
select x, y, z
Logical connectives, quantifiers, aggregates
from Person p
where p.getAge() = max(int i | exists(Person t | t.getAge() = i) | i) // general aggregate syntax, more verbose
select p
// Or use the following ordered aggregate
select max(Person p | | p order by p.getAge())
The two general forms are:
exists(<variable declarations> | <formula>)
<aggregate>(<variable declarations> | <formula restricting the range of eligible values> | <expression evaluated for each eligible value>)
exists(Person p | p.getName() = "test") checks whether there is a person named "test".
max(int i | exists(Person p | p.getAge() = i) | i): the second part restricts i to the values that are some person's age, and the third part is the expression being aggregated. At that point i ranges over the collection of everyone's ages, so the final result is max(i).
select max(Person p | | p order by p.getAge()) considers every person and returns the oldest one. The maximum is taken according to age. In other words, order by p.getAge() tells max() to find the maximum based on getAge(); it does not involve sorting all the objects.
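The three-part aggregate form can also be tried without any database classes. The following is a minimal sketch that only uses integer ranges; the variable names are my own and are not taken from the official tutorial.

```ql
// Each aggregate has the shape:
//   <aggregate>(<declarations> | <range formula> | <expression>)
from int total, int largestOdd, int evenCount
where
  // declare i, restrict it to 1..10, and add up i itself
  total = sum(int i | i in [1 .. 10] | i) and
  // same range, further restricted to odd numbers
  largestOdd = max(int i | i in [1 .. 10] and i % 2 = 1 | i) and
  // count how many even values fall inside the range
  evenCount = count(int i | i in [1 .. 10] and i % 2 = 0 | i)
select total, largestOdd, evenCount
```

With these ranges the single result row should be 55, 9, 5, which makes it easy to see which part of each aggregate restricts the range and which part is the value being aggregated.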
// Other aggregate exercises
select min(Person p | p.getLocation() = "east" | p order by p.getHeight()) // shortest person in the east of the village
select count(Person p | p.getLocation() = "south" | p) // number of people in the south of the village
select avg(Person p | | p.getHeight()) // average height of the villagers
select sum(Person p | p.getHairColor() = "brown" | p.getAge()) // sum of the ages of all villagers with brown hair
// General exercise, /docs/writing-codeql-queries/find-the-thief/#the-real-investigation
import tutorial
from Person p
where
  p.getHeight() > 150 and // taller than 150
  not p.getHairColor() = "blond" and // hair color is not blond
  exists(string c | p.getHairColor() = c) and // not bald: some hair color exists for this person, but its value does not need to be spelled out
  not p.getAge() < 30 and // aged 30 or older; could also be written p.getAge() >= 30
  p.getLocation() = "east" and // lives in the east of the village
  (p.getHairColor() = "black" or p.getHairColor() = "brown") and // has black or brown hair
  not (p.getHeight() > 180 and p.getHeight() < 190) and // not (taller than 180 and shorter than 190)
  exists(Person t | t.getAge() > p.getAge()) and // not the oldest person: there exists someone older than this person
  exists(Person t | t.getHeight() > p.getHeight()) and // not the tallest person
  p.getHeight() < avg(Person t | | t.getHeight()) and // shorter than the average height of everyone, with no restriction on the range
  p = max(Person t | t.getLocation() = "east" | t order by t.getAge()) // the oldest person in the east. This line is the official reference answer, but the official documentation says "Note that if there are several people with the same maximum age, the query lists all of them.", so if two people share the maximum age, both are listed, which could have consequences that are hard to control.
  // p.getAge() = max(Person t | t.getLocation() = "east" | t.getAge()) // based on my own understanding and ChatGPT's answer, this form should be used instead
select p
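The tie behavior mentioned in the last two comments can be checked on a tiny self-contained model. The sketch below assumes a hypothetical Villager class (it is not part of the tutorial library) in which two villagers share the maximum age, and it runs both formulations side by side. As far as I understand the semantics, both forms should report Ann and Bob, so it is worth running before relying on either form to return a single person.

```ql
// Hypothetical model: three villagers, two of whom share the maximum age.
class Villager extends string {
  Villager() { this = ["Ann", "Bob", "Cy"] }

  int getAge() {
    this = "Ann" and result = 70
    or
    this = "Bob" and result = 70
    or
    this = "Cy" and result = 40
  }
}

// Form 1: ordered aggregate, as in the official reference answer.
predicate oldestByOrder(Villager v) { v = max(Villager t | | t order by t.getAge()) }

// Form 2: compare the age with the maximum age.
predicate oldestByAge(Villager v) { v.getAge() = max(Villager t | | t.getAge()) }

from Villager v, string form
where
  oldestByOrder(v) and form = "order by"
  or
  oldestByAge(v) and form = "age comparison"
select v, form
```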
Predicates and classes
Predicates in CodeQL can roughly be understood as functions in other high-level programming languages: they also have parameters, return values, reusability, and other such properties.
Let's start with a simple example
import tutorial
predicate isSouthern(Person p) {
  p.getLocation() = "south"
}
from Person p
where isSouthern(p)
select p
Here the predicate performs a logical test that either holds or does not hold, somewhat like a boolean. QL does have a separate boolean type and the two are not the same thing, but drawing the analogy is enough for now, so I won't expand on it here.
Predicates are defined in much the same way as functions, and the keyword predicate can be replaced with a result type, e.g. int getAge() { result = xxx }
Predicate names must begin with a lowercase letter.
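To make the two kinds of predicate concrete, here is a minimal self-contained sketch. The names isSmall and getSuccessor are hypothetical and chosen only for illustration; the first is a predicate without a result, and the second replaces the predicate keyword with a result type and binds the special variable result.

```ql
// Predicate without result: it simply holds or does not hold for i.
predicate isSmall(int i) { i in [1 .. 9] }

// Predicate with result: `int` replaces the `predicate` keyword,
// and the body constrains the special variable `result`.
int getSuccessor(int i) {
  i in [1 .. 9] and
  result = i + 1
}

from int x
where isSmall(x)
select x, getSuccessor(x)
```

Both names start with a lowercase letter, as required for predicates.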
In addition, you can define a new class that contains exactly the people for whom isSouthern holds
class Southerner extends Person {
  Southerner() { isSouthern(this) }
}
from Southerner s
select s
This is similar to a class definition in an object-oriented language (OOL), which also has inheritance, encapsulation, methods, and so on. Here Southerner() resembles a constructor, but unlike a constructor in an OOL class it is a logical property and does not create an object. What an OOL class calls methods are called member predicates of the class in QL.
The formula isSouthern(this) defines the logical property of this class and is called the characteristic predicate. It uses the variable this (understood the same way as in an OOL) to say: if the property isSouthern(this) holds, then a Person, this, is a Southerner. A simple way to understand it is that the characteristic predicate of each subclass in QL answers two questions: which members of the parent class belong to my subclass, and what extra characteristics my subclass has compared with the parent class.
To quote the official documentation: "Classes in QL represent a logical property: a value is a member of a class when it satisfies this property. This means that a value can belong to many classes - belonging to a particular class does not prevent it from belonging to other classes."
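The point that one value can belong to several classes at once can be seen in a small sketch over integers. SmallNumber and EvenNumber are hypothetical classes made up for this illustration; neither extends the other, yet some values are members of both.

```ql
// Hypothetical classes over int, used only for illustration.
class SmallNumber extends int {
  SmallNumber() { this in [1 .. 10] }
}

class EvenNumber extends int {
  EvenNumber() { this in [1 .. 20] and this % 2 = 0 }
}

// A value that satisfies both characteristic predicates
// is a member of both classes at the same time.
from SmallNumber n
where n instanceof EvenNumber
select n
```

The values 2, 4, 6, 8, and 10 satisfy both characteristic predicates, so they are the ones this query should select.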
Take a look at the following example
class Child extends Person {
  Child() { this.getAge() < 10 }

  override predicate isAllowedIn(string region) { region = this.getLocation() }
}
// The isAllowedIn implementation in the parent class Person is as follows:
predicate isAllowedIn(string region) { region = ["north", "south", "east", "west"] }
// The parent's isAllowedIn(region) holds for every region, while the child's holds only for the region where the child currently lives (this.getLocation())
See a complete example
import tutorial
predicate isSoutherner(Person p) {
  p.getLocation() = "south"
}

class Southerner extends Person {
  Southerner() { isSoutherner(this) }
}

class Child extends Person {
  Child() { this.getAge() < 10 }

  override predicate isAllowedIn(string region) { region = this.getLocation() }
}

from Southerner s
where s.isAllowedIn("north")
select s, s.getAge()
One concept here must be kept completely separate from classes in an OOL. In an OOL, methods overridden in one subclass do not affect the other subclasses, and no subclass needs to consider whether it overlaps with another. In QL, however, to quote the official documentation: "Classes in QL represent a logical property: a value is a member of a class when it satisfies that property. This means that a value can belong to many classes - belonging to a particular class does not prevent it from belonging to other classes." In other words, every value that satisfies a subclass's characteristic predicate is a member of that subclass.
For the specific example in the code above, if some Person satisfies the characteristic predicates of both Southerner and Child, that person belongs to both classes and naturally inherits the member predicates of both.
Personally, I understand subclassing in QL as taking everything in the parent class, selecting certain elements of it through the characteristic predicate, and then overriding the member predicates of those elements, which in effect modifies the elements in the parent class. Here are three examples to compare and contrast
// Take everyone currently in the south, then keep those who can go to the north. Children are restricted to their own region, so no child among the Southerners can go north, and they are all filtered out.
from Southerner s
where s.isAllowedIn("north")
select s
// Take all the children. They can only stay where they live, so finding who can go north means finding who already lives in the north.
from Child c
where c.isAllowedIn("north")
select c
// Take every Person and find who can go north: all adults (by default everyone may enter every region) plus every Child who already lives in the north.
from Person p
where p.isAllowedIn("north")
select p
By extension, if several subclasses override the same member predicate at the same time, the following rules apply (assume three classes A, B, and C; a summary follows later):
- Suppose A is the parent class, its member predicate test() is not itself an override, and B and C both inherit from A and both override A's test() member predicate.
- If the type used in the from clause is A, test() is overridden by both B and C. Where B and C overlap, their overrides do not conflict and they coexist.
- If the type used in the from clause is B or C, then B/C serves as the basis, and for the values that overlap with the other class, that class's override is added as well. There is no conflict and the two coexist.
- If A is the parent class, B inherits from A, and C inherits from B, then C's override replaces B's for the same member predicate instead of coexisting with it.
- For multiple inheritance, where C inherits from both A and B: if A and B define the same member predicate, C must override it.
Example:
class OneTwoThree extends int {
  OneTwoThree() { // characteristic predicate
    this = 1 or this = 2 or this = 3
  }

  string getAString() { // member predicate
    result = "One, two or three: " + this.toString()
  }
}

class OneTwo extends OneTwoThree {
  OneTwo() { this = 1 or this = 2 }

  override string getAString() { result = "One or two: " + this.toString() }
}

from OneTwoThree o
select o, o.getAString()
/* Result:
o | getAString() result
1 | One or two: 1
2 | One or two: 2
3 | One, two or three: 3
Understanding: the OneTwoThree class defines 1, 2, and 3, and OneTwo overrides the member predicate for the 1 and 2 inside OneTwoThree. So OneTwoThree o has three values in total, of which 1 and 2 use OneTwo's member predicate and 3 uses OneTwoThree's member predicate. */
Scenario 1: add another class alongside this one (important), A->B and A->C
class TwoThree extends OneTwoThree {
  TwoThree() { this = 2 or this = 3 }

  override string getAString() { result = "Two or three: " + this.toString() }
}

/* Query:
from OneTwoThree o
select o, o.getAString()
Result:
o | getAString() result
1 | One or two: 1
2 | One or two: 2
2 | Two or three: 2
3 | Two or three: 3
Understanding: TwoThree and OneTwo overlap on 2, but unlike in an OOL, the two overrides do not conflict in QL; they coexist.
---
Query:
from OneTwo o
select o, o.getAString()
Result:
o | getAString() result
1 | One or two: 1
2 | One or two: 2
2 | Two or three: 2
Understanding: TwoThree and OneTwo both override the predicate for 2, and they coexist because QL has no conflict here. Since o has type OneTwo, the "foundation" is 1 and 2, and then TwoThree's override for 2 is added.
---
Query:
from TwoThree o
select o, o.getAString()
Result:
o | getAString() result
2 | One or two: 2
2 | Two or three: 2
3 | Two or three: 3
Understanding: TwoThree and OneTwo both override the predicate for 2, and they coexist because QL has no conflict here. Since o has type TwoThree, the "foundation" is 2 and 3, and then OneTwo's override for 2 is added. */
Scenario 2: A->B->C (inheritance chain)
class Two extends TwoThree {
  Two() { this = 2 }

  override string getAString() { result = "Two: " + this.toString() }
}

from TwoThree o
select o, o.getAString()
/* Result:
o | getAString() result
2 | One or two: 2
2 | Two: 2
3 | Two or three: 3
Understanding: building on the previous example, Two overrides the member predicate in TwoThree, so it does not coexist with TwoThree. */

from OneTwo o
select o, o.getAString()
/* Result:
o | getAString() result
1 | One or two: 1
2 | One or two: 2
2 | Two: 2
Understanding: building on the previous example, OneTwo and TwoThree coexist, but Two overrides part of TwoThree (that is, Two and TwoThree do not coexist). */
Stage summary: after studying all the examples above, the conclusion is simple. The core idea is to work out the inheritance chain. If two classes inherit from the same parent class, their results coexist; if the two classes are in a parent-child relationship, the child class overrides the corresponding part of the parent class.
In the example above, OneTwo and TwoThree are siblings that both inherit from OneTwoThree, so their results coexist without conflict. TwoThree and Two are parent and child, so by the most-specific-subclass-first principle Two overrides the corresponding part of TwoThree. Two also inherits indirectly from OneTwoThree, so this affects every ancestor class, including OneTwoThree.
Scenario 3: multiple inheritance
class Two extends OneTwo, TwoThree {
  Two() { this = 2 }

  override string getAString() { result = "Two: " + this.toString() }
}
// Explanation 1: Two inherits from both TwoThree and OneTwo. If the characteristic predicate is omitted, Two defaults to the values that satisfy both parents' conditions; if it is written, its range must still lie within that intersection.
// Explanation 2: if the parent classes in a multiple inheritance provide several definitions of a member predicate with the same name, those definitions must be overridden to avoid ambiguity. In this case, Two's getAString() cannot be omitted.
from OneTwoThree o
select o, o.getAString()
/* Result:
o | getAString() result
1 | One or two: 1
2 | Two: 2
3 | Two or three: 3
Understanding: since Two is in a parent-child relationship with both OneTwo and TwoThree, it overrides both of their results for the shared value 2 outright, rather than coexisting with them. */
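Explanation 1 can be tried directly by leaving out the characteristic predicate. The sketch below is self-contained and uses a hypothetical class named Shared in place of Two. Under the assumption that a class with no characteristic predicate contains exactly the values common to all of its parent classes, the query should return only the value 2.

```ql
class OneTwoThree extends int {
  OneTwoThree() { this in [1 .. 3] }

  string getAString() { result = "One, two or three: " + this.toString() }
}

class OneTwo extends OneTwoThree {
  OneTwo() { this in [1 .. 2] }

  override string getAString() { result = "One or two: " + this.toString() }
}

class TwoThree extends OneTwoThree {
  TwoThree() { this in [2 .. 3] }

  override string getAString() { result = "Two or three: " + this.toString() }
}

// Hypothetical class with no characteristic predicate: its values are
// the ones that belong to both OneTwo and TwoThree.
class Shared extends OneTwo, TwoThree {
  // Both parents define getAString(), so it has to be overridden here.
  override string getAString() { result = "Shared: " + this.toString() }
}

from Shared s
select s, s.getAString()
```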
Returning to the village example, go one step further and create a predicate isBald that determines whether a person is bald
predicate isBald(Person p) {
  not exists(string c | p.getHairColor() = c) // without the not, this would mean the person has hair
}
// Final result: bald southerners who are allowed into the north
from Southerner s
where s.isAllowedIn("north") and isBald(s)
select s, s.getAge()