Commit 9e662d0

Merge pull request #496 from CEED/v0.6changelog
v0.6: Release notes
2 parents bf9f342 + 0696387 commit 9e662d0

8 files changed (+130, -66 lines changed)

Doxyfile

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@ PROJECT_NAME = "libCEED"
 # could be handy for archiving the generated documentation or if some version
 # control system is used.
 
-PROJECT_NUMBER = v0.5
+PROJECT_NUMBER = v0.6
 
 # Using the PROJECT_BRIEF tag one can provide an optional one line description
 # for a project that appears at the top of each page and should give viewer a

ceed.pc.template

Lines changed: 1 addition & 1 deletion
@@ -4,6 +4,6 @@ libdir=${prefix}/lib
 
 Name: CEED
 Description: Code for Efficient Extensible Discretization
-Version: 0.5
+Version: 0.6
 Cflags: -I${includedir}
 Libs: -L${libdir} -lceed

doc/sphinx/source/conf.py

Lines changed: 4 additions & 1 deletion
@@ -249,7 +249,10 @@
 
 
 # Example configuration for intersphinx: refer to the Python standard library.
-intersphinx_mapping = {'https://docs.python.org/': None}
+intersphinx_mapping = {
+    'python': ('https://docs.python.org', None),
+    'numpy': ('https://numpy.org/devdocs', None),
+}
 
 
 # -- Options for breathe --------------------------------------------------
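
For context on the conf.py change: the removed line used the URL itself as the dictionary key, while the added lines key each entry by a short project name and give a (base URL, inventory) tuple. A minimal sketch of that shape follows. `check_intersphinx_mapping` is a hypothetical helper written only for illustration; it is not part of Sphinx or of this commit.

```python
# Named-entry form added by this commit: name -> (base URL, inventory location).
intersphinx_mapping = {
    "python": ("https://docs.python.org", None),
    "numpy": ("https://numpy.org/devdocs", None),
}

# URL-keyed form removed by this commit.
old_mapping = {"https://docs.python.org/": None}


def check_intersphinx_mapping(mapping):
    """Hypothetical helper: list entries that do not follow the named form."""
    problems = []
    for name, value in mapping.items():
        if "://" in name:
            problems.append(f"{name!r}: key is a URL, expected a short project name")
        elif not (isinstance(value, tuple) and len(value) == 2):
            problems.append(f"{name!r}: value should be a (url, inventory) tuple")
    return problems


new_problems = check_intersphinx_mapping(intersphinx_mapping)
old_problems = check_intersphinx_mapping(old_mapping)
print(new_problems)
print(old_problems)
```

The named form is also what lets the release notes refer to :py:class:`numpy.ndarray`, since the `numpy` entry supplies the inventory that reference resolves against.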

doc/sphinx/source/intro.rst

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ Introduction
 Historically, conventional high-order finite element methods were rarely used for
 industrial problems because the Jacobian rapidly loses sparsity as the order is
 increased, leading to unaffordable solve times and memory requirements
-:cite:`Brown:2010`. This effect typically limited the order of accuracy to at most
+:cite:`brown2010`. This effect typically limited the order of accuracy to at most
 quadratic, especially because they are computationally advantageous in terms of
 floating point operations (FLOPS) per degree of freedom (DOF)---see
 :numref:`fig-assembledVsmatrix-free`---, despite the fast convergence and favorable

doc/sphinx/source/references.bib

Lines changed: 36 additions & 34 deletions
@@ -1,4 +1,22 @@
-@article{Brown:2010,
+@article{arruda1993largestretch,
+title={A three-dimensional constitutive model for the large stretch behavior of rubber elastic materials},
+author={Arruda, Ellen M and Boyce, Mary C},
+journal={Journal of the Mechanics and Physics of Solids},
+volume={41},
+number={2},
+pages={389--412},
+year={1993},
+publisher={Elsevier}
+}
+
+@book{belytschko2013nonlinear,
+title={Nonlinear finite elements for continua and structures},
+author={Belytschko, Ted and Liu, Wing Kam and Moran, Brian and Elkhodary, Khalil},
+year={2013},
+publisher={John wiley \& sons}
+}
+
+@article{brown2010,
 Adsurl = {https://doi.org/10.1007/s10915-010-9396-8},
 Author = {{Brown}, J.},
 Journal = {Journal of Scientific Computing},
@@ -20,16 +38,13 @@ @article{giraldoetal2010
 doi = {10.1137/090775889}
 }
 
-@article{straka1993numerical,
-title={Numerical solutions of a non-linear density current: A benchmark solution and comparisons},
-author={Straka, Jerry M and Wilhelmson, Robert B and Wicker, Louis J and Anderson, John R and Droegemeier, Kelvin K},
-journal={International Journal for Numerical Methods in Fluids},
-volume={17},
-number={1},
-pages={1--22},
-year={1993},
-publisher={Wiley Online Library},
-doi={10.1002/fld.1650170103}
+@Book{holzapfel2000nonlinear,
+author={Holzapfel, Gerhard},
+title={Nonlinear solid mechanics: a continuum approach for engineering},
+publisher={Wiley},
+year={2000},
+address={Chichester New York},
+isbn={978-0-471-82319-3}
 }
 
 @article{hughesetal2010,
@@ -49,21 +64,18 @@ @book{hughes2012finite
 publisher={Courier Corporation}
 }
 
-@book{belytschko2013nonlinear,
-title={Nonlinear finite elements for continua and structures},
-author={Belytschko, Ted and Liu, Wing Kam and Moran, Brian and Elkhodary, Khalil},
-year={2013},
-publisher={John wiley \& sons}
+@article{straka1993numerical,
+title={Numerical solutions of a non-linear density current: A benchmark solution and comparisons},
+author={Straka, Jerry M and Wilhelmson, Robert B and Wicker, Louis J and Anderson, John R and Droegemeier, Kelvin K},
+journal={International Journal for Numerical Methods in Fluids},
+volume={17},
+number={1},
+pages={1--22},
+year={1993},
+publisher={Wiley Online Library},
+doi={10.1002/fld.1650170103}
 }
 
-@Book{holzapfel2000nonlinear,
-author={Holzapfel, Gerhard},
-title={Nonlinear solid mechanics: a continuum approach for engineering},
-publisher={Wiley},
-year={2000},
-address={Chichester New York},
-isbn={978-0-471-82319-3}
-}
 @article{williams2009roofline,
 title={Roofline: an insightful visual performance model for multicore architectures},
 author={Williams, Samuel and Waterman, Andrew and Patterson, David},
@@ -74,13 +86,3 @@ @article{williams2009roofline
 year={2009},
 publisher={ACM New York, NY, USA}
 }
-@article{arruda1993largestretch,
-title={A three-dimensional constitutive model for the large stretch behavior of rubber elastic materials},
-author={Arruda, Ellen M and Boyce, Mary C},
-journal={Journal of the Mechanics and Physics of Solids},
-volume={41},
-number={2},
-pages={389--412},
-year={1993},
-publisher={Elsevier}
-}

doc/sphinx/source/releasenotes.rst

Lines changed: 79 additions & 0 deletions
@@ -5,6 +5,85 @@ On this page we provide a summary of the main API changes, new features and exam
 for each release of libCEED.
 
 
+.. _v0.6:
+
+v0.6 (Mar 29, 2020)
+----------------------------------------
+
+libCEED v0.6 contains numerous new features and examples, as well as expanded
+documentation in `this new website <https://libceed.readthedocs.io>`_.
+
+New features
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+* New Python interface using `CFFI <https://cffi.readthedocs.io/>`_ provides a nearly
+  1-1 correspondence with the C interface, plus some convenience features. For instance,
+  data stored in the :cpp:type:`CeedVector` structure are available without copy as
+  :py:class:`numpy.ndarray`. Short tutorials are provided in
+  `Binder <https://mybinder.org/v2/gh/CEED/libCEED/master?urlpath=lab/tree/examples/tutorials/>`_.
+* Linear QFunctions can be assembled as block-diagonal matrices (per quadrature point,
+  :cpp:func:`CeedOperatorAssembleLinearQFunction`) or to evaluate the diagonal
+  (:cpp:func:`CeedOperatorAssembleLinearDiagonal`). These operations are useful for
+  preconditioning ingredients and are used in the libCEED's multigrid examples.
+* The inverse of separable operators can be obtained using
+  :cpp:func:`CeedOperatorCreateFDMElementInverse` and applied with
+  :cpp:func:`CeedOperatorApply`. This is a useful preconditioning ingredient,
+  especially for Laplacians and related operators.
+* New functions: :cpp:func:`CeedVectorNorm`, :cpp:func:`CeedOperatorApplyAdd`,
+  :cpp:func:`CeedQFunctionView`, :cpp:func:`CeedOperatorView`.
+* Make public accessors for various attributes to facilitate writing composable code.
+* New backend: ``/cpu/self/memcheck/serial``.
+* QFunctions using variable-length array (VLA) pointer constructs can be used with CUDA
+  backends. (Single source is coming soon for OCCA backends.)
+* Fix some missing edge cases in CUDA backend.
+
+Performance Improvements
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+* MAGMA backend performance optimization and non-tensor bases.
+* No-copy optimization in :cpp:func:`CeedOperatorApply`.
+
+Interface changes
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+* Replace :code:`CeedElemRestrictionCreateIdentity` and
+  :code:`CeedElemRestrictionCreateBlocked` with more flexible
+  :cpp:func:`CeedElemRestrictionCreateStrided` and
+  :cpp:func:`CeedElemRestrictionCreateBlockedStrided`.
+* Add arguments to :cpp:func:`CeedQFunctionCreateIdentity`.
+* Replace ambiguous uses of :cpp:enum:`CeedTransposeMode` for L-vector identification
+  with :cpp:enum:`CeedInterlaceMode`. This is now an attribute of the
+  :cpp:type:`CeedElemRestriction` (see :cpp:func:`CeedElemRestrictionCreate`) and no
+  longer passed as ``lmode`` arguments to :cpp:func:`CeedOperatorSetField` and
+  :cpp:func:`CeedElemRestrictionApply`.
+
+Examples
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+libCEED-0.6 contains greatly expanded examples with :ref:`new documentation <Examples>`.
+Notable additions include:
+
+* Standalone :ref:`ex2-surface` (:file:`examples/ceed/ex2-surface`): compute the area of
+  a domain in 1, 2, and 3 dimensions by applying a Laplacian.
+* PETSc :ref:`example-petsc-area` (:file:`examples/petsc/area.c`): computes surface area
+  of domains (like the cube and sphere) by direct integration on a surface mesh;
+  demonstrates geometric dimension different from topological dimension.
+* PETSc :ref:`example-petsc-bps`:
+
+  * :file:`examples/petsc/bpsraw.c` (formerly ``bps.c``): transparent CUDA support.
+  * :file:`examples/petsc/bps.c` (formerly ``bpsdmplex.c``): performance improvements
+    and transparent CUDA support.
+  * :ref:`example-petsc-bps-sphere` (:file:`examples/petsc/bpssphere.c`):
+    generalizations of all CEED BPs to the surface of the sphere; demonstrates geometric
+    dimension different from topological dimension.
+
+* :ref:`example-petsc-multigrid` (:file:`examples/petsc/multigrid.c`): new p-multigrid
+  solver with algebraic multigrid coarse solve.
+* :ref:`example-petsc-navier-stokes` (:file:`examples/fluids/navierstokes.c`; formerly
+  ``examples/navier-stokes``): unstructured grid support (using PETSc's ``DMPlex``),
+  implicit time integration, SU/SUPG stabilization, free-slip boundary conditions, and
+  quasi-2D computational domain support.
+* :ref:`example-petsc-elasticity` (:file:`examples/solids/elasticity.c`): new solver for
+  linear elasticity, small-strain hyperelasticity, and globalized finite-strain
+  hyperelasticity using p-multigrid with algebraic multigrid coarse solve.
+
 .. _v0.5:
 
 v0.5 (Sep 18, 2019)
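
The v0.6 notes describe the operator diagonal as a preconditioning ingredient. A minimal sketch of that idea in plain Python, without libCEED: a matrix-free operator, its diagonal, and a damped Jacobi iteration built from the two. Every name here is hypothetical, and the diagonal is recovered naively by applying the operator to unit vectors; this stands in for, and does not call, :cpp:func:`CeedOperatorAssembleLinearDiagonal`.

```python
def apply_operator(u):
    """Matrix-free 1D Laplacian stencil (-1, 2, -1) with zero Dirichlet ends."""
    n = len(u)
    v = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        v.append(2.0 * u[i] - left - right)
    return v


def operator_diagonal(n):
    """Recover the diagonal by applying the operator to each unit vector."""
    diag = []
    for i in range(n):
        e = [0.0] * n
        e[i] = 1.0
        diag.append(apply_operator(e)[i])
    return diag


def jacobi_solve(b, iterations=300, omega=2.0 / 3.0):
    """Damped Jacobi: x <- x + omega * D^{-1} (b - A x)."""
    diag = operator_diagonal(len(b))
    x = [0.0] * len(b)
    for _ in range(iterations):
        r = [bi - ai for bi, ai in zip(b, apply_operator(x))]
        x = [xi + omega * ri / di for xi, ri, di in zip(x, r, diag)]
    return x


b = [1.0] * 5
x = jacobi_solve(b)
residual = max(abs(bi - ai) for bi, ai in zip(b, apply_operator(x)))
print(residual)
```

In practice the diagonal would more likely serve as a smoother inside multigrid than as a standalone solver; the fixed iteration count here only keeps the sketch short.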

examples/ceed/index.rst

Lines changed: 7 additions & 27 deletions
@@ -50,37 +50,17 @@ Similarly to :ref:`Ex1-Volume`, it computes:
    I = \int_{\partial \Omega} \mathbf{1} \, dS .
    :label: eq-ex2-surface
 
-but this time by solving a Laplace's equation for a harmonic function
-:math:`u(\mathbf{x})`. We write the Laplace's equation
+but this time by applying the divergence theorem using a Laplacian.
+In particular, we select :math:`u(\bm x) = x_0 + x_1 + x_2`, for which :math:`\nabla u = [1, 1, 1]^T`, and thus :math:`\nabla u \cdot \hat{\bm n} = 1`.
 
-.. math::
-   \nabla \cdot \nabla u = 0, \textrm{ for } \mathbf{x} \in \Omega .
-   :label: eq-laplace
-
-We can rewrite this via the bilinear form :math:`a(\cdot, \cdot)` and the linear form
-:math:`\langle \cdot, \cdot \rangle` as
+Given Laplace's equation,
 
 .. math::
-   a(u,v) = \langle, v,f \rangle
+   -\nabla \cdot \nabla u = 0, \textrm{ for } \mathbf{x} \in \Omega
 
-where :math:`v` is the test function, and for which :math:`\langle, v,f \rangle=0` in
-this case. We
-obtain
+multiply by a test function :math:`v` and integrate by parts to obtain
 
 .. math::
-   a(u,v) = \int_\Omega v \nabla \cdot \nabla u \, dV = \int_{\partial \Omega} v \nabla u \cdot \mathbf{n}\, dS - \int_\Omega \nabla v \cdot \nabla u \, dV = 0 ,
-
-where we have used integration by parts.
+   \int_\Omega \nabla v \cdot \nabla u \, dV - \int_{\partial \Omega} v \nabla u \cdot \hat{\bm n}\, dS = 0 .
 
-:math:`a(u,v) = 0` because we have chosen :math:`u(\mathbf{x})` to be harmonic, so we
-can write
-
-.. math::
-   \int_{\partial \Omega} v \nabla u \cdot \mathbf{n}\, dS = \int_\Omega \nabla v \cdot \nabla u \, dV
-   :label: eq-laplace-by-parts
-
-and use the :ref:`CeedOperator` for Laplace's operator to compute the right-hand side of
-equation :math:numref:`eq-laplace-by-parts`. This way, the left-hand side of equation
-:math:numref:`eq-laplace-by-parts` (which gives :math:numref:`eq-ex2-surface` because
-we have chosen :math:`u(\mathbf{x}) = (x + y + z)` such that
-:math:`\nabla u \cdot \mathbf{n} = 1`) is readily found.
+Since we have chosen :math:`u` such that the boundary integrand is :math:`v 1`, we may evaluate the surface integral by applying the volumetric Laplacian and summing the result.
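
The rewritten text can be checked in one dimension with a minimal sketch, assuming linear finite elements on the unit interval and :math:`u(x) = x`. Applying the weak-form Laplacian leaves nonzero entries only at the two boundary nodes, where the outward normals point in opposite directions. Summing magnitudes rather than signed values is an assumption of this sketch, made so that the two ends do not cancel. All function and variable names are hypothetical.

```python
def apply_laplacian_1d(u, h):
    """Apply the weak-form Laplacian, w_a = integral of phi_a' * u' dx,
    using linear elements of width h on a uniform 1D mesh."""
    w = [0.0] * len(u)
    for e in range(len(u) - 1):
        du = (u[e + 1] - u[e]) / h  # constant gradient of u on element e
        # The basis gradients on the element are -1/h and +1/h; integrating
        # over the element width h leaves -du and +du.
        w[e] -= du
        w[e + 1] += du
    return w


num_elem = 8
h = 1.0 / num_elem
x = [i * h for i in range(num_elem + 1)]
u = list(x)  # u(x) = x, so du/dx = 1 everywhere

w = apply_laplacian_1d(u, h)
surface = sum(abs(wi) for wi in w)
print(w[0], w[-1], surface)
```

Interior entries cancel between neighboring elements, so the total counts the two boundary points of the interval, the 1D analogue of surface area.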

examples/solids/index.rst

Lines changed: 1 addition & 1 deletion
@@ -458,7 +458,7 @@ That is, given the linearization point :math:`\bm F` and solution increment :mat
 #. conclude by :math:numref:`eq-diff-P`, where :math:`\bm S` is either stored or recomputed from its definition exactly as in the nonlinear residual evaluation.
 
 .. note::
-   The decision of whether to recompute or store functions of the current state :math:`\bm F` depends on a roofline analysis :cite:`williams2009roofline,Brown:2010` of the computation and the cost of the constitutive model.
+   The decision of whether to recompute or store functions of the current state :math:`\bm F` depends on a roofline analysis :cite:`williams2009roofline,brown2010` of the computation and the cost of the constitutive model.
    For low-order elements where flops tend to be in surplus relative to memory bandwidth, recomputation is likely to be preferable, where as the opposite may be true for high-order elements.
    Similarly, analysis with a simple constitutive model may see better performance while storing little or nothing while an expensive model such as Arruda-Boyce :cite:`arruda1993largestretch`, which contains many special functions, may be faster when using more storage to avoid recomputation.
    In the case where complete linearization is preferred, note the symmetry :math:`\mathsf C_{IJKL} = \mathsf C_{KLIJ}` evident in :math:numref:`eq-neo-hookean-incremental-stress-index`, thus :math:`\mathsf C` can be stored as a symmetric :math:`6\times 6` matrix, which has 21 unique entries.
