Abstract

Volumetric design, also called massing design, is the first and a critical step in professional building design, and it is sequential in nature. Because the process requires careful design decisions and iterative adjustments, the underlying design sequence encodes valuable information for designers. Many efforts have been made to generate reasonable volumetric designs automatically, but the quality of the generated solutions varies, and evaluating a design solution requires either a prohibitively comprehensive set of metrics or expensive human expertise. Whereas previous approaches learn only the final design rather than the sequential design tasks, we propose to encode the design knowledge in a collection of expert or high-performing design sequences and to extract useful representations using transformer-based models. We then use the learned representations for two crucial downstream applications: design preference evaluation and procedural design generation. We build the preference model by estimating the density of the learned representations, and we train an autoregressive transformer model for sequential design generation. We demonstrate these ideas on a novel dataset of thousands of sequential volumetric designs. Our preference model can compare two arbitrary design sequences and is almost 90% accurate when evaluated against random design sequences. Our autoregressive model can also autocomplete a volumetric design sequence from a partial one.
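
As a rough illustration of density-based preference, the sketch below fits a single multivariate Gaussian to a set of stand-in embedding vectors and prefers whichever of two candidates has the higher log-density. It is a minimal sketch under stated assumptions, not the model described in the abstract: the Gaussian density estimator, the synthetic 8-dimensional embeddings, and the names `fit_gaussian`, `log_density`, and `prefer` are all hypothetical choices made for illustration, and the abstract does not specify which density estimator is used.

```python
import numpy as np


def fit_gaussian(embeddings):
    """Fit a multivariate Gaussian to embeddings (one row per design sequence)."""
    mean = embeddings.mean(axis=0)
    # A small ridge term keeps the covariance matrix invertible.
    cov = np.cov(embeddings, rowvar=False) + 1e-6 * np.eye(embeddings.shape[1])
    return mean, cov


def log_density(x, mean, cov):
    """Log-density of a single embedding x under the fitted Gaussian."""
    dim = mean.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    mahalanobis = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (dim * np.log(2.0 * np.pi) + logdet + mahalanobis)


def prefer(a, b, mean, cov):
    """Return 0 if embedding a is preferred (higher density), otherwise 1."""
    return 0 if log_density(a, mean, cov) >= log_density(b, mean, cov) else 1


# Synthetic stand-ins for learned representations of expert design sequences
# (assumption: real embeddings would come from a trained encoder).
rng = np.random.default_rng(0)
expert = rng.normal(loc=1.0, scale=0.5, size=(500, 8))
mean, cov = fit_gaussian(expert)

near = np.full(8, 1.0)  # close to the expert distribution
far = np.full(8, 5.0)   # far from the expert distribution
choice = prefer(near, far, mean, cov)
```

Here `choice` is 0 because `near` lies close to the bulk of the stand-in expert embeddings. A more expressive density estimator could be substituted without changing the comparison logic.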
