
Commit a154c94

Improved handling of iteration in the compiler.
This commit improves the handling of iteration in a couple of ways.

- First, it enables pattern matching in `for` loops and `FlatMap` (vmware-archive#784). One can now write:
  ```
  for ((k,v) in map) {}
  ```
  instead of
  ```
  for (kv in map) { var k = kv.0 }
  ```
  and
  ```
  (var x, var y) = FlatMap(expr)
  ```
  instead of:
  ```
  var xy = FlatMap(expr),
  var x = xy.0
  ```
  In general, the pattern can contain constructor names, tuples, placeholders, and variable declarations. The pattern must be irrefutable, i.e., any constructor name used in the pattern must be the unique constructor for the given type.

- Second, we remove the hardwired knowledge about iterable types from the compiler. Until now the compiler only knew how to iterate over `Vec`, `Set`, `Map`, `Group`, and `TinySet` types. Instead we now allow the programmer to label any extern type as iterable, meaning that it implements `iter()` and `into_iter()` methods that return Rust iterators. This is done using one of two attributes:
  ```
  #[iterate_by_ref=iter:<type>]
  ```
  or
  ```
  #[iterate_by_val=iter:<type>]
  ```
  where `type` is the type yielded by the `next()` method of the iterator. The former indicates that the `iter()` method returns a by-reference iterator, the latter indicates that `iter()` returns a by-value iterator. Example:
  ```
  #[iterate_by_val=iter:('K,'V)]
  extern type Map<'K,'V>
  ```
  This feature will enable us to introduce new container types, e.g., immutable sets and maps, in the future. Unfortunately, it introduces one, hopefully minor, usability regression. Since we can no longer distinguish a map from any other container whose iterator yields 2-tuples, we cannot reliably convert DDlog maps to/from Java maps in the FlatBuf-based API. They must therefore be passed around as arrays of key-value tuples and converted to/from Java maps by the client.

Future todos:
- Support pattern matching in closure arguments and the LHS of `group_by` clauses.
- Should refutable patterns be supported for `FlatMap`? This would have the effect of filtering the flattened collection.
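The client-side conversion mentioned above can be sketched as follows. This is only an illustration under stated assumptions: `KeyValue` is a hypothetical stand-in for a generated tuple class (the test code in this commit reads tuple fields through `a0()` and `a1()`), and `TupleMaps` is a made-up helper name, not part of the DDlog API. With the real API, writer-side tuples are created through the generated builder rather than constructed directly.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a generated 2-tuple class; a0() is the key, a1() the value. */
final class KeyValue<K, V> {
    private final K key;
    private final V value;

    KeyValue(K key, V value) {
        this.key = key;
        this.value = value;
    }

    K a0() { return key; }
    V a1() { return value; }
}

/** Hypothetical helper (not part of DDlog) converting between tuple lists and Java maps. */
final class TupleMaps {
    private TupleMaps() {}

    /** Builds a map from key-value tuples; a later duplicate key overwrites an earlier one. */
    static <K, V> Map<K, V> toMap(List<KeyValue<K, V>> tuples) {
        Map<K, V> map = new LinkedHashMap<>();
        for (KeyValue<K, V> t : tuples) {
            map.put(t.a0(), t.a1());
        }
        return map;
    }

    /** Flattens a map into a list of key-value tuples, in the map's iteration order. */
    static <K, V> List<KeyValue<K, V>> toTuples(Map<K, V> map) {
        List<KeyValue<K, V>> tuples = new ArrayList<>(map.size());
        for (Map.Entry<K, V> e : map.entrySet()) {
            tuples.add(new KeyValue<>(e.getKey(), e.getValue()));
        }
        return tuples;
    }
}
```

A `LinkedHashMap` is used here so that the round trip preserves the order in which tuples were received; a client that does not care about order could use any other `Map` implementation.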
1 parent 5294a7c commit a154c94

31 files changed, +798 -422 lines changed

CHANGELOG.md

Lines changed: 44 additions & 0 deletions
@@ -3,6 +3,50 @@ All notable changes to this project will be documented in this file.

 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).

+## [Unreleased]
+
+### Language improvements
+
+- Enable pattern matching in `for` loops and `FlatMap`. One can now write:
+```
+for ((k,v) in map) {}
+```
+instead of
+```
+for (kv in map) { var k = kv.0 }
+```
+and
+```
+(var x, var y) = FlatMap(expr)
+```
+instead of:
+```
+var xy = FlatMap(expr),
+var x = xy.0
+```
+
+- Remove the hardwired knowledge about iterable types from the compiler.
+Until now the compiler only knew how to iterate over `Vec`, `Set`,
+`Map`, `Group`, and `TinySet` types. Instead we now allow the programmer
+to label any extern type as iterable, meaning that it implements `iter()`
+and `into_iter()` methods, that return Rust iterators using one of two
+attributes:
+```
+#[iterate_by_ref=iter:<type>]
+```
+or
+```
+#[iterate_by_val=iter:<type>]
+```
+where `type` is the type yielded by the `next()` method of the iterator.
+The former indicates that the `iter()` method returns a by-reference
+iterator, the latter indicates that `iter()` returns a by-value
+iterator.
+
+As a side effect of this change, the compiler no longer distinguishes
+maps from other containers that iterate over 2-tuples. Therefore maps are
+represented as lists of tuples in the Flatbuf-based Java API.
+
 ## [0.36.0] - Feb 7, 2021

 ### API changes

doc/java_api.md

Lines changed: 1 addition & 1 deletion
@@ -422,5 +422,5 @@ the same table.
 | Struct: `S<t1,..,tN>` | `S__t1..__tnWriter` | `S__t1..__tnReader` |
 | `Vec<T>` | `List<T_w>` | `List<T_r>` |
 | `Set<T>` | `List<T_w>` | `List<T_r>` |
-| `Map<K,V>` | `Map<K_w,V_w>` | `Map<K_r,V_r>` |
+| `Map<K,V>` | `List<Tuple2__K_w__V_w>` | `List<Tuple2__K_r__V_r>` |
 | `Ref<T>` | `T_w` | `T_r` |

doc/language_reference/language_reference.md

Lines changed: 27 additions & 2 deletions
@@ -497,7 +497,7 @@ cons_term ::= (* positional arguments *)
 ("," "." field_name "=" expr)* "}"]
 var_term ::= var_name
 ite_term ::= "if" term term [ "else" term ]
-for_term ::= "for" "(" var_name "in" expr ")" term
+for_term ::= "for" "(" for_pattern "in" expr ")" term
 return_term ::= "return" [expr]
 vardecl_term ::= "var" var_name
@@ -526,6 +526,20 @@ pattern ::= (* tuple pattern *)
 | "_" (* wildcard, matches any value *)
 ```

+```EBNF
+(* pattern that binds loop variables in a for-loop *)
+for_pattern ::= (* tuple pattern *)
+"(" [pattern (,pattern)*] ")"
+(* constructor pattern with positional arguments. The constructor must be the unique constructor for the type.*)
+| cons_name ["{" [pattern (,pattern)*] "}"]
+(* constructor pattern with named arguments. The constructor must be the unique constructor for the type. *)
+| cons_name "{" ["." field_name "=" pattern
+("," "." field_name "=" pattern)*] "}"
+| vardecl_term (* binds variable to a field inside the matched value or the entire value. *)
+| var_term (* binds variable to a field inside the matched value (shorthand for vardecl_term) *)
+| "_" (* wildcard, matches any value *)
+```
+
 ### Constraints on expressions

 1. Number and types of arguments to a function must match function
@@ -647,9 +661,20 @@ rhs_clause ::= atom (* 1.atom *)
 | "not" atom (* 2.negated atom *)
 | expr (* 3.condition *)
 | expr "=" expr (* 4.assignment *)
-| "var" var_name "=" "FlatMap" "(" expr ")" (* 5.flat map *)
+| flatmap_pattern "=" "FlatMap" "(" expr ")" (* 5.flat map *)
 | "var" var_name = expr "." "group_by" (* 6.grouping; in general a *)
 "(" expr ")" (* group_by clause can be any expression containing `expr.group_by(expr)` subexpression. *)
+
+(* pattern that binds variables in the left-hand side of a FlatMap clause *)
+flatmap_pattern ::= (* tuple pattern *)
+"(" [pattern (,pattern)*] ")"
+(* constructor pattern with positional arguments. The constructor must be the unique constructor for a type. *)
+| cons_name ["{" [pattern (,pattern)*] "}"]
+(* constructor pattern with named arguments. The constructor must be the unique constructor for the type. *)
+| cons_name "{" ["." field_name "=" pattern
+("," "." field_name "=" pattern)*] "}"
+| vardecl_term (* binds variable to a field inside the matched value or the entire value. *)
+| "_" (* wildcard, matches any value *)
 ```

 An atom is a predicate that holds when a given value belongs to a relation.

doc/tutorial/tutorial.md

Lines changed: 37 additions & 5 deletions
@@ -1481,11 +1481,11 @@ containing item name, vendor, and price:
 function best_vendor_string(g: Group<string, (string, bit<64>)>): string
 {
 var min_vendor = "";
-var min_price: bit<64> = 'hffffffffffffffff;
-for (vendor_price in g) {
-if (vendor_price.1 < min_price) {
-min_vendor = vendor_price.0;
-min_price = vendor_price.1
+var min_price = 'hffffffffffffffff;
+for ((vendor, price) in g) {
+if (price < min_price) {
+min_vendor = vendor;
+min_price = price;
 }
 };
 "Best deal for ${group_key(g)}: ${min_vendor}, $${min_price}"
@@ -3060,6 +3060,38 @@ function returns a reference in Rust:
 extern function deref(x: Ref<'A>): 'A
 ```

+### `#[iterate_by_ref]` and `#[iterate_by_val]`
+
+These attributes apply to extern types only and tell the compiler that the given
+type is **iterable**. Iterable types can be iterated over in a for-loop and
+flattened by `FlatMap`. The syntax for these attributes is:
+
+```
+#[iterate_by_ref=iter:<type>]
+```
+
+```
+#[iterate_by_val=iter:<type>]
+```
+
+Both variants tell the DDlog compiler that the Rust type that the extern type
+declaration binds to implements the `iter()` method, which returns a type that
+implements the Rust `Iterator` trait, and that the `next()` method of the
+resulting iterator returns values of type `<type>`. The first form indicates
+that the iterator returns the contents of the collection by reference, while
+the second form indicates that the iterator returns elements in the collection
+by value.
+
+Here are two examples from `ddlog_std.dl`:
+
+```
+#[iterate_by_ref=iter:'A]
+extern type Vec<'A>
+
+#[iterate_by_val=iter:('K,'V)]
+extern type Map<'K,'V>
+```
+
 ### `#[deserialize_from_array=func()]`

 This attribute is used in conjunction with `json.dl` library (and potentially

java/test_flatbuf1/Test.java

Lines changed: 35 additions & 35 deletions
@@ -57,11 +57,11 @@ private String printNI(NIReader v) {
 }

 private String printQI(QIReader v) {
-Map<Long,String> vs = v.m();
+List<Tuple2__bit_32___stringReader> vs = v.m();
 ArrayList<String> vs_strs = new ArrayList<String>(vs.size());
-for (Map.Entry<Long,String> entry : vs.entrySet()) {
-vs_strs.add(entry.getKey() + "=>\"" + entry.getValue() + "\"");
-}
+vs.forEach((t) -> {
+vs_strs.add(t.a0() + "=>\"" + t.a1() + "\"");
+});
 return "QI{{" + String.join(", ", vs_strs) + "}}";
 }

@@ -117,19 +117,15 @@ private String printOptString(Object v) {
 }
 }

-private String printZI5(Map<String, ManyReader> v) {
-ArrayList<String> strings = new ArrayList<String>(v.size());
-for (Map.Entry<String, ManyReader> e: v.entrySet()) {
-strings.add("\"" + e.getKey() + "\"=>" + printMany(e.getValue()));
-}
+private String printZI5(List<Tuple2__string__ManyReader> vs) {
+ArrayList<String> strings = new ArrayList<String>(vs.size());
+vs.forEach((t) -> strings.add("\"" + t.a0() + "\"=>" + printMany(t.a1())));
 return "{" + String.join(", ", strings) + "}";
 }

-private String printZI11(Map<ManyReader, String> v) {
-ArrayList<String> strings = new ArrayList<String>(v.size());
-for (Map.Entry<ManyReader, String> e: v.entrySet()) {
-strings.add(printMany(e.getKey()) + "=>\"" + e.getValue() + "\"");
-}
+private String printZI11(List<Tuple2__Many__stringReader> vs) {
+ArrayList<String> strings = new ArrayList<String>(vs.size());
+vs.forEach((t) -> strings.add(printMany(t.a0()) + "=>\"" + t.a1() + "\""));
 return "{" + String.join(", ", strings) + "}";
 }

@@ -427,7 +423,7 @@ void onFBCommit(DDlogCommand<Object> command) throws IOException {
 }
 // output relation OZI5[Map<string, Many>]
 case typesTestRelation.OZI5: {
-fb_file.println("From " + relid + " " + command.kind() + " " + printZI5((Map<String, ManyReader>)command.value()));
+fb_file.println("From " + relid + " " + command.kind() + " " + printZI5((List<Tuple2__string__ManyReader>)command.value()));
 break;
 }
 // output relation OZI6[Option<string>]
@@ -461,7 +457,7 @@ void onFBCommit(DDlogCommand<Object> command) throws IOException {
 }
 // output relation OZI11[Map<Many, string>]
 case typesTestRelation.OZI11: {
-fb_file.println("From " + relid + " " + command.kind() + " " + printZI11((Map<ManyReader,String>)command.value()));
+fb_file.println("From " + relid + " " + command.kind() + " " + printZI11((List<Tuple2__Many__stringReader>)command.value()));
 break;
 }
 // output relation OZI12[(string, bigint, Vec<bigint>, (bit<16>, Many))]
@@ -612,9 +608,9 @@ void update() throws DDlogException {
 builder.insert_PI5(pi);
 }
 {
-Map<Long, String> map = new HashMap<Long, String>();
-map.put(Long.valueOf(2), "here");
-map.put(Long.valueOf(3), "there");
+ArrayList<Tuple2__bit_32___stringWriter> map = new ArrayList<Tuple2__bit_32___stringWriter>();
+map.add(builder.create_Tuple2__bit_32___string(2, "here"));
+map.add(builder.create_Tuple2__bit_32___string(3, "there"));
 builder.insert_QI(map);
 }
 builder.insert_RI(2);
@@ -677,9 +673,9 @@ void update() throws DDlogException {
 builder.insert_ZI4(strings);
 }
 {
-Map<String, ManyWriter> map = new HashMap<String, ManyWriter>();
-map.put("key1", builder.create_B(false));
-map.put("key2", builder.create_A("val2"));
+ArrayList<Tuple2__string__ManyWriter> map = new ArrayList<Tuple2__string__ManyWriter>();
+map.add(builder.create_Tuple2__string__Many("key1", builder.create_B(false)));
+map.add(builder.create_Tuple2__string__Many("key2", builder.create_A("val2")));
 builder.insert_ZI5(map);
 }
 builder.insert_ZI6_ddlog_std_Some("ZI6");
@@ -690,11 +686,11 @@ void update() throws DDlogException {
 builder.insert_ZI9(100);
 builder.insert_ZI10("Ref<IString>");
 {
-Map<ManyWriter, String> map = new HashMap<ManyWriter, String>();
+ArrayList<Tuple2__Many__stringWriter> map = new ArrayList<Tuple2__Many__stringWriter>();
 //map.put(builder.create_B(false), "v1");
 // Cannot add more than one record, since the map is printed in
 // non-deterministic order, so the diff fails.
-map.put(builder.create_A("val2"), "v2");
+map.add(builder.create_Tuple2__Many__string(builder.create_A("val2"), "v2"));
 builder.insert_ZI11(map);
 }
 {
@@ -1103,12 +1099,16 @@ void run() throws IOException, DDlogException {

 query_file.println("Query QI_by_m[(2=>\"here\", 3=>\"there\"]:");
 {
-Map<Long, String> map = new HashMap<Long, String>();
-map.put(Long.valueOf(2), "here");
-map.put(Long.valueOf(3), "there");
-typesTestQuery.queryQI_by_m(this.api, map, v -> {
-query_file.println(printQI(v));
-});
+typesTestQuery.queryQI_by_m(this.api,
+bldr -> {
+ArrayList<Tuple2__bit_32___stringWriter> map = new ArrayList<Tuple2__bit_32___stringWriter>();
+map.add(bldr.create_Tuple2__bit_32___string(Long.valueOf(2), "here"));
+map.add(bldr.create_Tuple2__bit_32___string(Long.valueOf(3), "there"));
+return map;
+},
+v -> {
+query_file.println(printQI(v));
+});
 }

 query_file.println("Query RI_by_refm[2]:");
@@ -1274,9 +1274,9 @@ void run() throws IOException, DDlogException {
 query_file.println("Query ZI5_by_self[...]:");
 typesTestQuery.queryZI5_by_self(this.api,
 bldr -> {
-Map<String, ManyWriter> map = new HashMap<String, ManyWriter>();
-map.put("key1", bldr.create_B(false));
-map.put("key2", bldr.create_A("val2"));
+ArrayList<Tuple2__string__ManyWriter> map = new ArrayList<Tuple2__string__ManyWriter>();
+map.add(bldr.create_Tuple2__string__Many("key1", bldr.create_B(false)));
+map.add(bldr.create_Tuple2__string__Many("key2", bldr.create_A("val2")));
 return map;
 },
 v -> { query_file.println(printZI5(v)); });
@@ -1322,8 +1322,8 @@ void run() throws IOException, DDlogException {
 query_file.println("Query ZI11_by_self[\"val2\"=>\"v2\"]:");
 typesTestQuery.queryZI11_by_self(this.api,
 bldr -> {
-Map<ManyWriter, String> map = new HashMap<ManyWriter, String>();
-map.put(bldr.create_A("val2"), "v2");
+ArrayList<Tuple2__Many__stringWriter> map = new ArrayList<Tuple2__Many__stringWriter>();
+map.add(bldr.create_Tuple2__Many__string(bldr.create_A("val2"), "v2"));
 return map;
 },
 v -> { query_file.println(printZI11(v)); });
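The comment retained in `update()` notes that a map with more than one record is printed in non-deterministic order, which makes the output diff fail. One possible way around this, sketched here as an assumption rather than anything this commit does, is to sort the rendered entries before joining them. `SortedPrinter` and `render` are hypothetical names, and the expected-output files would have to list entries in the same sorted order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper (not part of this commit) that prints entries in a stable order. */
final class SortedPrinter {
    private SortedPrinter() {}

    /** Joins already-rendered entries in lexicographic order, wrapped in braces. */
    static String render(List<String> entries) {
        // Copy first so the caller's list is left untouched.
        List<String> sorted = new ArrayList<>(entries);
        Collections.sort(sorted);
        return "{" + String.join(", ", sorted) + "}";
    }
}
```

Sorting the rendered strings rather than the tuples keeps the helper independent of the generated tuple classes, at the cost of ordering entries by their printed form instead of by key value.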

lib/ddlog_std.dl

Lines changed: 4 additions & 0 deletions
@@ -435,6 +435,7 @@ extern function hash128(x: 'X): bit<128>
 * represents a non-empty list of objects.
 * `'K` is the type of group key, and `'V` is the type of value in the group.
 */
+#[iterate_by_val=iter:'V]
 extern type Group<'K,'V>

 /* Extracts group key.
@@ -559,6 +560,7 @@ function max(g: Group<'K, 'V>): 'V {
 * Vec
 */
 #[dyn_alloc]
+#[iterate_by_ref=iter:'A]
 extern type Vec<'A>

 extern function vec_empty(): Vec<'A>
@@ -669,6 +671,7 @@ function update_nth(v: mut Vec<'X>, idx: usize, value: 'X): bool {
 */

 #[dyn_alloc]
+#[iterate_by_val=iter:('K,'V)]
 extern type Map<'K,'V>

 extern function map_empty(): Map<'K, 'V>
@@ -716,6 +719,7 @@ function keys(m: Map<'K, 'V>): Vec<'K> {
 */

 #[dyn_alloc]
+#[iterate_by_ref=iter:'A]
 extern type Set<'A>

 extern function set_singleton(x: 'X): Set<'X>

lib/tinyset.dl

Lines changed: 1 addition & 0 deletions
@@ -5,6 +5,7 @@
 * copying (e.g., 8, 16, 32, or 64-bit integers). */

 #[dyn_alloc]
+#[iterate_by_val=iter:'X]
 extern type Set64<'X>

 extern function size(s: Set64<'X>): bit<64>
