Building Your Own Entity Generator in PHP on Top of a DSL (TreeStructInfo 2.0)
Introduction
Automation plays a major role in modern backend applications. Generating code, managing data schemas, and keeping models in sync with the database are tasks that quickly become a source of errors and frustration when done by hand.
One way to tackle this problem is to create your own DSL (Domain-Specific Language) together with the tools that process it. In this article I will show how to build a simple but very flexible system based on:
- a custom text format (TreeStructInfo 2.0),
- a parser that converts it into a data structure,
- a PHP code generator that produces entity classes.
What is TreeStructInfo?
TreeStructInfo is a structured text format that resembles a simplified AST. It lets you describe data hierarchically:
treestructinfo "2.0" name "Entity Generator Demo"
node User
attr TableName "users"
node Fields
node Id
attr Type "int"
attr PrimaryKey "true"
end node
end node
end node
end tree
Key elements:
- node: represents a structure (for example an entity or a field)
- attr: an attribute of the enclosing element
- nesting: builds the tree structure
The parser: the heart of the system
With auto-casting enabled, the parser turns the text into a PHP structure:
[
'version' => '2.0',
'name' => 'Entity Generator Demo',
'data' => [
'User' => [
'TableName' => 'users',
'Fields' => [
'Id' => [
'Type' => 'int',
'PrimaryKey' => true
]
]
]
]
]
Parser features:
- nesting support (node)
- attribute parsing (attr)
- references (ref attr, ref node)
- automatic type casting (for example "true" → true)
In practice this produces the AST (Abstract Syntax Tree) of your DSL.
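To make the recursion behind nested nodes concrete, here is a minimal sketch that handles only node, attr, and end node lines. The helper name tsiParseLines is hypothetical and is not part of the parser presented in this article; comments, references, multiline values, and error handling are deliberately left out.

```php
<?php
// Minimal sketch: parses only `node`, `attr`, and `end node` lines.
// tsiParseLines() is a hypothetical helper, not part of TreeStructInfoParser.
function tsiParseLines(array $lines, int &$pos = 0): array
{
    $result = [];
    while ($pos < count($lines)) {
        $line = trim($lines[$pos++]);
        if ($line === 'end node') {
            break; // close the current nesting level
        }
        if (preg_match('/^attr\s+(\S+)\s+"([^"]*)"$/', $line, $m)) {
            $result[$m[1]] = $m[2]; // leaf: attribute value
        } elseif (preg_match('/^node\s+(\S+)$/', $line, $m)) {
            $result[$m[1]] = tsiParseLines($lines, $pos); // recurse into the child node
        }
    }
    return $result;
}

$tree = tsiParseLines([
    'node User',
    'attr TableName "users"',
    'node Fields',
    'node Id',
    'attr Type "int"',
    'end node',
    'end node',
    'end node',
]);
```

Each call consumes lines until it meets the end node that closes its own level, so the depth of the PHP array mirrors the depth of the nesting in the source text.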
Entity generator
From the data returned by the parser we can generate PHP code:
class User
{
private int $id;
public function getId(): ?int
{
return $this->id;
}
public function setId(int $id): self
{
$this->id = $id;
return $this;
}
}
The key fragment of the generator:
foreach ($fields as $fieldName => $field) {
$type = $this->mapType($field['Type'] ?? 'string');
$propName = lcfirst($fieldName);
$code .= "private {$type} \${$propName};";
}
The generator:
- maps types (int, string, datetime)
- creates properties
- generates getters and setters
The structure mismatch problem
One of the most common mistakes is passing the generator input whose shape it does not expect.
The parser returns:
$data = [
'User' => [...],
'Post' => [...]
];
The generator, on the other hand, usually expects a single entity.
Solution:
foreach ($data as $entityName => $entityData) {
echo $generator->generate($entityName, $entityData);
}
References and reusability
TreeStructInfo supports a reference mechanism:
Attribute declaration:
ref attr DefaultLength "255"
Usage:
ref attr DefaultLength
Node:
ref node Timestamps
This lets you:
- avoid duplication
- create modular definitions
- build reusable schemas
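The idea of declaring a value once and using it by name can be sketched as a two-pass rewrite over the lines. The helper expandRefAttrs below is hypothetical and much simpler than the parser in this article: it only covers single-line ref attr declarations and replaces each usage with an ordinary attr line.

```php
<?php
// Sketch: expand `ref attr Name` usages from a table of declared values.
// expandRefAttrs() is a hypothetical helper used only for illustration.
function expandRefAttrs(array $lines): array
{
    $declared = [];
    // Pass 1: collect declarations of the form: ref attr Name "Value"
    foreach ($lines as $line) {
        if (preg_match('/^ref attr (\S+)\s+"([^"]*)"$/', trim($line), $m)) {
            $declared[$m[1]] = $m[2];
        }
    }
    // Pass 2: rewrite bare usages into ordinary attr lines
    $out = [];
    foreach ($lines as $line) {
        $line = trim($line);
        if (preg_match('/^ref attr (\S+)\s+"/', $line)) {
            continue; // declarations produce no output of their own
        }
        if (preg_match('/^ref attr (\S+)$/', $line, $m)) {
            if (!array_key_exists($m[1], $declared)) {
                throw new RuntimeException("Undefined reference attribute: {$m[1]}");
            }
            $line = sprintf('attr %s "%s"', $m[1], $declared[$m[1]]);
        }
        $out[] = $line;
    }
    return $out;
}

$expanded = expandRefAttrs([
    'node Email',
    'ref attr DefaultLength',
    'end node',
    'ref attr DefaultLength "255"',
]);
```

Collecting declarations in a separate first pass is what allows a declaration to appear after its usage, as it does in the sample file later in the article.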
Advantages of the DSL + generator approach
1. Single source of truth
You describe the schema only once.
2. Automation
From a single file you can generate:
- PHP entities
- SQL
- APIs
- validation
3. Flexibility
You can extend the language with your own constructs:
- relations
- validation rules
- indexes
4. Scalability
The project grows without chaos in the models.
Possible extensions
Relations
attr Relation "ManyToOne:User"
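A generator would first have to split such a value into its parts. The sketch below assumes the "Kind:Target" convention from the line above; the function name parseRelation and the list of accepted kinds are assumptions made for illustration.

```php
<?php
// Sketch: split a value such as "ManyToOne:User" into kind and target.
// parseRelation() and the list of accepted kinds are assumptions.
function parseRelation(string $value): array
{
    $parts = explode(':', $value, 2);
    if (count($parts) !== 2 || $parts[0] === '' || $parts[1] === '') {
        throw new InvalidArgumentException("Invalid relation: $value");
    }
    [$kind, $target] = $parts;
    $allowed = ['OneToOne', 'OneToMany', 'ManyToOne', 'ManyToMany'];
    if (!in_array($kind, $allowed, true)) {
        throw new InvalidArgumentException("Unknown relation kind: $kind");
    }
    return ['kind' => $kind, 'target' => $target];
}

$relation = parseRelation('ManyToOne:User');
```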
Validation
attr Validate "email"
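One possible way to act on such an attribute is to map the rule name to one of PHP's built-in filters. Only the "email" rule appears in this article; the "int" and "url" rules and the function name validateValue are assumed extras.

```php
<?php
// Sketch: map a Validate rule name to a check based on filter_var().
// Only "email" comes from the article; "int" and "url" are assumptions.
function validateValue(string $rule, string $value): bool
{
    return match ($rule) {
        'email' => filter_var($value, FILTER_VALIDATE_EMAIL) !== false,
        'int'   => filter_var($value, FILTER_VALIDATE_INT) !== false,
        'url'   => filter_var($value, FILTER_VALIDATE_URL) !== false,
        default => throw new InvalidArgumentException("Unknown rule: $rule"),
    };
}

$isValid = validateValue('email', 'user@example.com');
```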
SQL generation
CREATE TABLE users (...)
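As a rough sketch, the same array that feeds the entity generator could drive a CREATE TABLE builder. The function name generateCreateTable, the MySQL-flavoured type mapping, the NOT NULL default, and the snake_case column naming are all assumptions; the attribute names (Type, PrimaryKey, AutoIncrement, DefaultStringLength) are taken from the sample file later in the article.

```php
<?php
// Sketch: build a CREATE TABLE statement from one parsed entity.
// The SQL type mapping and column naming are assumptions (MySQL-flavoured).
function generateCreateTable(string $entityName, array $entityData): string
{
    $table = $entityData['TableName'] ?? strtolower($entityName);
    $sqlTypes = ['int' => 'INT', 'bool' => 'TINYINT(1)', 'text' => 'TEXT', 'datetime' => 'DATETIME'];
    // Accept both the raw string and the auto-cast boolean
    $isTrue = fn ($v): bool => $v === true || $v === 'true';

    $columns = [];
    foreach ($entityData['Fields'] ?? [] as $fieldName => $field) {
        $type = $field['Type'] ?? 'string';
        $sqlType = $sqlTypes[$type]
            ?? sprintf('VARCHAR(%d)', (int) ($field['DefaultStringLength'] ?? 255));
        // CamelCase field name to snake_case column name
        $column = strtolower(preg_replace('/(?<!^)[A-Z]/', '_$0', $fieldName));
        $definition = "  {$column} {$sqlType} NOT NULL";
        if ($isTrue($field['AutoIncrement'] ?? false)) {
            $definition .= ' AUTO_INCREMENT';
        }
        if ($isTrue($field['PrimaryKey'] ?? false)) {
            $definition .= ' PRIMARY KEY';
        }
        $columns[] = $definition;
    }
    return "CREATE TABLE {$table} (\n" . implode(",\n", $columns) . "\n);";
}

$sql = generateCreateTable('User', [
    'TableName' => 'users',
    'Fields' => [
        'Id' => ['Type' => 'int', 'PrimaryKey' => 'true', 'AutoIncrement' => 'true'],
        'Email' => ['Type' => 'string', 'DefaultStringLength' => '255'],
    ],
]);
```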
ORM integration (e.g. Doctrine)
CLI tool
php tsi generate entity schema.tsi
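The tsi command above does not exist yet, so the following is only a sketch of how its arguments might be interpreted. The function name parseCliArgs and the exact usage string are assumptions.

```php
<?php
// Sketch: interpret the arguments of the hypothetical `tsi` command.
function parseCliArgs(array $argv): array
{
    // $argv[0] is the script name, as in PHP's own $argv
    $args = array_slice($argv, 1);
    if (count($args) !== 3 || $args[0] !== 'generate') {
        throw new InvalidArgumentException('Usage: php tsi generate <target> <schema-file>');
    }
    return ['command' => $args[0], 'target' => $args[1], 'schema' => $args[2]];
}

$options = parseCliArgs(['tsi', 'generate', 'entity', 'schema.tsi']);
```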
Working example
TreeStructInfo 2.0 parser: TreeStructInfoParser.php
<?php
/**
* TreeStructInfoParser
*
* A parser for the TreeStructInfo text-based structured format.
* Supports attribute parsing, nested nodes, reference attributes,
* auto type casting, and caching.
* Licensed under the BSD 3-Clause License.
* See the LICENSE.md file for full license text.
*/
class TreeStructInfoParser
{
private $lines;
private $refLines;
private $currentLine = 0;
private $currentRefLine = 0;
private $references = [];
private $nodeReferences = [];
private $autoCasting = false;
private $multilineJoinChar = PHP_EOL;
private $trueValues = ['true'];
private $falseValues = ['false'];
/**
* Enables or disables automatic value casting.
*
* @param bool $value
*/
public function setAutoCasting($value)
{
$this->autoCasting = $value;
}
/**
* Sets the character or string used to join multiline attribute values.
*
* @param string $char
*/
public function setMultilineJoinChar($char)
{
$this->multilineJoinChar = $char;
}
/**
* Sets the boolean string representations used for auto casting.
*
* @param array $trueValues
* @param array $falseValues
* @throws Exception
*/
public function setBoolValues($trueValues, $falseValues)
{
if (empty($trueValues)) {
throw new Exception("Empty True values");
}
if (empty($falseValues)) {
throw new Exception("Empty False values");
}
$this->trueValues = $trueValues;
$this->falseValues = $falseValues;
}
/**
* Parses a TreeStructInfo string and returns a structured array.
*
* @param string $input
* @return array
* @throws Exception
*/
public function parse($input)
{
$allLines = array_values(array_filter(array_map('trim', explode("\n", $input))));
$treeLines = [];
$i = 0;
$lastRefAttrName = null;
$lastRefNodeName = null;
$endTree = false;
$this->refLines = [];
$this->currentRefLine = 0;
while ($i < count($allLines)) {
$line = $allLines[$i];
// ref attr Name "Value"
if (preg_match('/^ref attr (.+?)\s+"([^"]+)"$/', $line, $matches)) {
$this->references[$matches[1]] = $matches[2];
$lastRefAttrName = $matches[1];
$i++;
continue;
}
// next line values on ref attr Name
if (($lastRefAttrName !== null) && (str_starts_with($line, '"')))
{
$this->references[$lastRefAttrName] .= $this->multilineJoinChar . trim($line, '"');
$i++;
continue;
}
if (str_starts_with($line, 'end tree'))
{
$endTree = true;
}
// ref node Some ...
if ($endTree && preg_match('/^ref node (.+?)$/', $line, $matches)) {
$refName = $matches[1];
$this->refLines[] = 'node '.$refName;
$i++; // Skip ref node line
while ($i < count($allLines) && trim($allLines[$i]) !== 'end ref node') {
$this->refLines[] = $allLines[$i];
$i++;
}
$this->refLines[] = 'end node';
if (!isset($allLines[$i]) || trim($allLines[$i]) !== 'end ref node') {
throw new Exception("Missing 'end ref node' for ref node $refName");
}
$i++; // Skip 'end ref node'
$refNode = $this->parseAnonymousNode();
$this->nodeReferences[$refName] = $refNode[$refName];
continue;
}
$treeLines[] = $line;
$i++;
}
$this->lines = $treeLines;
$this->currentLine = 0;
return $this->parseTree();
}
/**
* Parses a TreeStructInfo file and caches the result as PHP.
* If cache is newer than source, returns the cached version.
*
* @param string $inputFile
* @param string $cacheFile
* @return array
*/
public function parseWithCache($inputFile, $cacheFile)
{
if (file_exists($cacheFile) && filemtime($cacheFile) >= filemtime($inputFile)) {
return include $cacheFile;
}
$input = file_get_contents($inputFile);
$parsed = $this->parse($input);
$export = var_export($parsed, true);
file_put_contents($cacheFile, "<?php\nreturn $export;\n");
return $parsed;
}
/**
* Parses a reference node structure (with optional children).
*
* @return array
* @throws Exception
*/
private function parseAnonymousNode()
{
$line = $this->nextRefLine();
if (!preg_match('/^node\s+(.+?)$/', $line, $matches)) {
throw new Exception("Invalid node start: $line");
}
$name = $matches[1];
$node[$name] = [];
$lastname = null;
while (($line = $this->peekRefLine()) !== null && $line !== 'end node') {
if (str_starts_with($line, '::')) {
$this->nextRefLine();
} elseif (str_starts_with($line, 'attr ')) {
$attr = $this->parseAttr($this->nextRefLine());
$node[$name][$attr['name']] = $attr['value'];
$lastname = $attr['name'];
} elseif (str_starts_with($line, '"')) {
$attr = $this->nextRefLine();
if ($lastname) {
$node[$name][$lastname] .= $this->multilineJoinChar . trim($attr, '"');
}
} elseif (str_starts_with($line, 'ref attr ')) {
$attr = $this->resolveRefAttr($this->nextRefLine());
$node[$name][$attr['name']] = $attr['value'];
} elseif (str_starts_with($line, 'ref node ')) {
$refnode = $this->resolveRefNode($this->nextRefLine());
$childname = $refnode['name'];
$childvalue = $refnode['value'];
$node[$name][$childname] = $childvalue;
} elseif (str_starts_with($line, 'node ')) {
$childnode = $this->parseAnonymousNode();
$node[$name] = (isset($node[$name]) && is_array($node[$name])) ? array_merge($node[$name], $childnode) : $childnode;
} else {
throw new Exception("Unexpected line in node: $line");
}
}
$this->expectRefLine('end node');
return $node;
}
/**
* Parses the main tree structure.
*
* @return array
* @throws Exception
*/
private function parseTree()
{
$header = $this->nextLine();
while (str_starts_with($header, '::')) {
$header = $this->nextLine();
}
if (!preg_match('/^treestructinfo\s+"(\d+\.\d+)"(?:\s+name\s+"([^"]+)")?$/', $header, $matches)) {
throw new Exception("Invalid header format");
}
$result = [];
$result['version'] = $matches[1];
$result['name'] = isset($matches[2]) ? $matches[2] : null;
$tree = [];
$lastname = null;
while (($line = $this->peekLine()) !== null && $line !== 'end tree') {
if (str_starts_with($line, '::')) {
$this->nextLine(); // skip comment
} elseif (str_starts_with($line, 'attr ')) {
$attr = $this->parseAttr($this->nextLine());
$tree[$attr['name']] = $attr['value'];
$lastname = $attr['name'];
} elseif (str_starts_with($line, '"')) {
$attr = $this->nextLine();
if ($lastname) {
$tree[$lastname] .= $this->multilineJoinChar . trim($attr, '"');
}
} elseif (str_starts_with($line, 'ref attr ')) {
$attr = $this->resolveRefAttr($this->nextLine());
$tree[$attr['name']] = $attr['value'];
} elseif (str_starts_with($line, 'ref node ')) {
$refnode = $this->resolveRefNode($this->nextLine());
$tree[$refnode['name']] = $refnode['value'];
} elseif (str_starts_with($line, 'node ')) {
$node = $this->parseNode();
$tree = array_merge($tree, $node);
} else {
throw new Exception("Unexpected line in tree: $line");
}
}
$this->expectLine('end tree');
$result['data'] = $tree;
return $result;
}
/**
* Parses a single node structure (with optional children).
*
* @return array
* @throws Exception
*/
private function parseNode()
{
$line = $this->nextLine();
if (!preg_match('/^node\s+(.+?)$/', $line, $matches)) {
throw new Exception("Invalid node start: $line");
}
$name = $matches[1];
$node[$name] = [];
$lastname = null;
while (($line = $this->peekLine()) !== null && $line !== 'end node') {
if (str_starts_with($line, '::')) {
$this->nextLine();
} elseif (str_starts_with($line, 'attr ')) {
$attr = $this->parseAttr($this->nextLine());
$node[$name][$attr['name']] = $attr['value'];
$lastname = $attr['name'];
} elseif (str_starts_with($line, '"')) {
$attr = $this->nextLine();
if ($lastname) {
$node[$name][$lastname] .= $this->multilineJoinChar . trim($attr, '"');
}
} elseif (str_starts_with($line, 'ref attr ')) {
$attr = $this->resolveRefAttr($this->nextLine());
$node[$name][$attr['name']] = $attr['value'];
} elseif (str_starts_with($line, 'ref node ')) {
$refnode = $this->resolveRefNode($this->nextLine());
$node[$name][$refnode['name']] = $refnode['value'];
} elseif (str_starts_with($line, 'node ')) {
$childnode = $this->parseNode();
$node[$name] = (isset($node[$name]) && is_array($node[$name])) ? array_merge($node[$name], $childnode) : $childnode;
} else {
throw new Exception("Unexpected line in node: $line");
}
}
$this->expectLine('end node');
return $node;
}
/**
* Parses a single attribute line.
*
* @param string $line
* @return array
* @throws Exception
*/
private function parseAttr($line)
{
if (!preg_match('/^attr\s+(.+?)\s+"([^"]*)"?$/', $line, $matches)) {
throw new Exception("Invalid attribute: $line");
}
$name = $matches[1];
$value = isset($matches[2]) ? $matches[2] : '';
return [
'name' => $name,
'value' => $this->autoCasting ? $this->autoCast($value) : $value,
];
}
/**
* Resolves a reference attribute from previously stored values.
*
* @param string $line
* @return array
* @throws Exception
*/
private function resolveRefAttr($line)
{
if (!preg_match('/^ref attr (.+)$/', $line, $matches)) {
throw new Exception("Invalid ref attr usage: $line");
}
$refName = $matches[1];
if (!isset($this->references[$refName])) {
throw new Exception("Undefined reference attribute: $refName");
}
return [
'name' => $refName,
'value' => $this->autoCasting ? $this->autoCast($this->references[$refName]) : $this->references[$refName],
];
}
/**
* Resolves a reference node from previously stored nodes.
*
* @param string $line
* @return array
* @throws Exception
*/
private function resolveRefNode($line)
{
if (!preg_match('/^ref node (.+)$/', $line, $matches)) {
throw new Exception("Invalid ref node usage: $line");
}
$refName = $matches[1];
if (!isset($this->nodeReferences[$refName])) {
throw new Exception("Undefined reference node: $refName");
}
return [
'name' => $refName,
'value' => $this->nodeReferences[$refName],
];
}
/**
* Peeks at the current line without advancing the line pointer.
*
* @return string|null
*/
private function peekLine()
{
return isset($this->lines[$this->currentLine]) ? $this->lines[$this->currentLine] : null;
}
/**
* Peeks at the current ref line without advancing the line pointer.
*
* @return string|null
*/
private function peekRefLine()
{
return isset($this->refLines[$this->currentRefLine]) ? $this->refLines[$this->currentRefLine] : null;
}
/**
* Retrieves the current line and advances the line pointer.
*
* @return string|null
*/
private function nextLine()
{
// Null-coalescing avoids an "Undefined array key" warning past the last line
return $this->lines[$this->currentLine++] ?? null;
}
/**
* Retrieves the current ref line and advances the line pointer.
*
* @return string|null
*/
private function nextRefLine()
{
// Null-coalescing avoids an "Undefined array key" warning past the last line
return $this->refLines[$this->currentRefLine++] ?? null;
}
/**
* Ensures the next line matches the expected value.
*
* @param string $expected
* @throws Exception
*/
private function expectLine($expected)
{
$line = $this->nextLine();
if ($line !== $expected) {
throw new Exception("Expected '$expected' but found '$line'");
}
}
/**
* Ensures the next ref line matches the expected value.
*
* @param string $expected
* @throws Exception
*/
private function expectRefLine($expected)
{
$line = $this->nextRefLine();
if ($line !== $expected) {
throw new Exception("Expected '$expected' but found '$line'");
}
}
/**
* Attempts to auto-cast a string to bool, int, float, or binary if applicable.
*
* @param string $value
* @return mixed
*/
public function autoCast($value)
{
$trimmed = trim($value);
// Coords check
if (preg_match('/^\+?(0[xob])?[0-9a-fA-F]+,\+?(0[xob])?[0-9a-fA-F]+$/i', $trimmed)) {
$parts = explode(',', $trimmed);
return array_map([$this, 'autoCast'], $parts);
}
// Logic check
$condTrue = false;
foreach ($this->trueValues as $v) {
$condTrue = $condTrue || (strcasecmp($trimmed, $v) === 0);
}
$condFalse = false;
foreach ($this->falseValues as $v) {
$condFalse = $condFalse || (strcasecmp($trimmed, $v) === 0);
}
if ($condTrue) return true;
if ($condFalse) return false;
// Hex (prefixed with 0x)
if (preg_match('/^0x[0-9a-fA-F]+$/', $trimmed)) return hexdec($trimmed);
// Binary (prefixed with 0b)
if (preg_match('/^0b[01]+$/', $trimmed)) return bindec($trimmed);
// Octal (prefixed with 0o)
if (preg_match('/^0o[0-7]+$/i', $trimmed)) return octdec($trimmed);
// Decimal integer
if (preg_match('/^[+-]?\d+$/', $trimmed)) return (int) $trimmed;
// Decimal float
if (preg_match('/^[+-]?\d+[.,]\d+([eE][+-]?\d+)?$/', $trimmed)) {
return (float) str_replace(',', '.', $trimmed);
}
// Binary data as hex string
if (preg_match('/^[a-fA-F0-9]{4,}$/', $trimmed) && strlen($trimmed) % 2 === 0) {
return hex2bin($trimmed);
}
// Base64-encoded string
if (base64_encode(base64_decode($trimmed, true)) === $trimmed) {
return base64_decode($trimmed);
}
return $trimmed;
}
}
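One thing to keep in mind when enabling auto-casting: the last two checks in autoCast() accept many ordinary words. A four-letter value such as "bool" is valid Base64 and "cafe" is valid hex, so both come back as decoded bytes instead of the original text. A more conservative alternative, sketched below under the assumption that only booleans, integers, and floats need casting, leaves every other string untouched. The function castScalar is hypothetical and not part of the parser.

```php
<?php
// Sketch: cast only unambiguous scalars and leave every other string alone.
// castScalar() is a hypothetical alternative to autoCast().
function castScalar(string $value): bool|int|float|string
{
    $trimmed = trim($value);
    $lower = strtolower($trimmed);
    if ($lower === 'true') {
        return true;
    }
    if ($lower === 'false') {
        return false;
    }
    if (preg_match('/^[+-]?\d+$/', $trimmed)) {
        return (int) $trimmed;
    }
    if (preg_match('/^[+-]?\d+\.\d+$/', $trimmed)) {
        return (float) $trimmed;
    }
    return $value; // anything else stays a string
}

$casted = array_map('castScalar', ['true', '42', '1.5', 'bool']);
```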
Entity generator: EntityGenerator.php
<?php
class EntityGenerator
{
public function generate(string $entityName, array $entityData): string
{
$tableName = $entityData['TableName'] ?? strtolower($entityName);
$fields = $entityData['Fields'] ?? [];
$code = "<?php\n\n";
$code .= "class {$entityName}\n{\n";
// Properties
foreach ($fields as $fieldName => $field) {
$type = $this->mapType($field['Type'] ?? 'string');
$propName = lcfirst($fieldName);
$code .= " private {$type} \${$propName};\n";
}
$code .= "\n";
// Getters & Setters
foreach ($fields as $fieldName => $field) {
$type = $this->mapType($field['Type'] ?? 'string');
$propName = lcfirst($fieldName);
$methodName = ucfirst($propName);
// Getter
$code .= " public function get{$methodName}(): ?{$type}\n";
$code .= " {\n";
$code .= " return \$this->{$propName};\n";
$code .= " }\n\n";
// Setter
$code .= " public function set{$methodName}({$type} \${$propName}): self\n";
$code .= " {\n";
$code .= " \$this->{$propName} = \${$propName};\n";
$code .= " return \$this;\n";
$code .= " }\n\n";
}
$code .= "}\n\n";
return $code;
}
private function mapType(string $type): string
{
return match ($type) {
'int' => 'int',
'string' => 'string',
'datetime' => '\\DateTime',
'bool' => 'bool',
'text' => 'string',
default => 'mixed',
};
}
}
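The generated getters are declared as returning ?type, while the properties are non-nullable and have no default, so calling a getter before the matching setter raises an Error about an uninitialized typed property. In addition, the mixed fallback cannot be prefixed with ?. One possible adjustment, sketched here with the hypothetical helpers nullableType and renderNullableProperty, is to emit nullable properties that default to null.

```php
<?php
// Sketch: render nullable properties so a ?type getter can safely return null
// before the setter has been called. Both helpers are hypothetical.
function nullableType(string $type): string
{
    // "mixed" already includes null and must not be prefixed with "?"
    return $type === 'mixed' ? 'mixed' : '?' . $type;
}

function renderNullableProperty(string $type, string $propName): string
{
    return sprintf("    private %s \$%s = null;\n", nullableType($type), $propName);
}

$line = renderNullableProperty('int', 'id');
```

The same nullableType() helper could be used for the getter's return type, which would keep the property and the getter consistent.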
Sample definition file: entity.tsi
treestructinfo "2.0" name "Entity Generator Demo"
node User
attr TableName "users"
node Fields
node Id
attr Type "int"
attr PrimaryKey "true"
attr AutoIncrement "true"
end node
node Email
attr Type "string"
ref attr DefaultStringLength
attr Unique "true"
end node
node Password
attr Type "string"
ref attr DefaultStringLength
end node
node IsActive
attr Type "bool"
ref attr ActiveFlag
end node
node CreatedAt
attr Type "datetime"
end node
end node
end node
node Post
attr TableName "posts"
node Fields
node Id
attr Type "int"
attr PrimaryKey "true"
attr AutoIncrement "true"
end node
node Title
attr Type "string"
ref attr DefaultStringLength
end node
node Content
attr Type "text"
end node
node CreatedAt
attr Type "datetime"
end node
end node
end node
end tree
:: global attribute references
ref attr DefaultStringLength "255"
ref attr ActiveFlag "true"
Running the generator
<?php
include 'EntityGenerator.php';
include 'TreeStructInfoParser.php';
$parser = new TreeStructInfoParser;
$parser->setAutoCasting(false);
$content = file_get_contents('entity.tsi');
$tree = $parser->parse($content);
$data = $tree['data'];
$generator = new EntityGenerator;
foreach ($data as $entityName => $entityData) {
$text = $generator->generate($entityName, $entityData);
file_put_contents($entityName .'.php', $text);
}
Result
Generated entity User.php
<?php
class User
{
private int $id;
private string $email;
private string $password;
private bool $isActive;
private \DateTime $createdAt;
public function getId(): ?int
{
return $this->id;
}
public function setId(int $id): self
{
$this->id = $id;
return $this;
}
public function getEmail(): ?string
{
return $this->email;
}
public function setEmail(string $email): self
{
$this->email = $email;
return $this;
}
public function getPassword(): ?string
{
return $this->password;
}
public function setPassword(string $password): self
{
$this->password = $password;
return $this;
}
public function getIsActive(): ?bool
{
return $this->isActive;
}
public function setIsActive(bool $isActive): self
{
$this->isActive = $isActive;
return $this;
}
public function getCreatedAt(): ?\DateTime
{
return $this->createdAt;
}
public function setCreatedAt(\DateTime $createdAt): self
{
$this->createdAt = $createdAt;
return $this;
}
}
Generated entity Post.php
<?php
class Post
{
private int $id;
private string $title;
private string $content;
private \DateTime $createdAt;
public function getId(): ?int
{
return $this->id;
}
public function setId(int $id): self
{
$this->id = $id;
return $this;
}
public function getTitle(): ?string
{
return $this->title;
}
public function setTitle(string $title): self
{
$this->title = $title;
return $this;
}
public function getContent(): ?string
{
return $this->content;
}
public function setContent(string $content): self
{
$this->content = $content;
return $this;
}
public function getCreatedAt(): ?\DateTime
{
return $this->createdAt;
}
public function setCreatedAt(\DateTime $createdAt): self
{
$this->createdAt = $createdAt;
return $this;
}
}
Summary
By creating:
- your own DSL,
- a parser,
- a code generator,
you lay the foundation for something much bigger than an entity generator.
This approach:
- simplifies development,
- eliminates repetition,
- gives you full control over the architecture.
In practice it is the first step towards building your own:
- ORM
- backend framework
- code generation system
TreeStructInfo format website:
https://tsinfo.4programmers.net/pl/index.htm
My online application for testing the format:
http://www.dariuszrorat.ugu.pl/aplikacje/treestructinfo-tester
Syntax highlighting used in this article:
http://www.dariuszrorat.ugu.pl/blog/wpis/97-podswietlenie-skladni-formatu-treestructinfo-codemirror-i-highlightjs