Grain Boundary Motion 4
Bill has corrected the boundary conditions so let's restate the problem. We have (the first equation is missing due to some bug or other in trac).
with a jump at , but both and are continuous. The jump condition for the first equation is,
This works perfectly as the equation can now be written
in the finite volume formulation. The issue is how to provide a boundary condition for the second equation with the correct form for the jump condition. Define to be the rotation of the grain boundary and the displacement angle of the upper surfaces in the frame of reference where the grain boundary is vertical. The following holds,
Essentially, we have to eliminate the angles to obtain an expression involving only the and the surface energies. Easy enough to do, but what we really want is a jump condition that fits with equation 2. Ideally this would be of the form,
where is a constant. This isn't possible with the above expression involving the angles and grain boundaries. It is quite obvious that
is just the jump in where is an arc length. This condition requires the vertical displacement from the horizontal and hence knowledge of the orientation of the grain boundary. In terms of the angles,
From a practical point of view, I suppose we can deal with this boundary condition simply by iterating at each time step. However, there seems to be something deeply unsatisfying about that.
Grain Boundary Motion 4
Bill has corrected the boundary conditions so let's restate the problem. We have:
with a jump at , but both and are continuous. The jump condition for the first equation is,
Grain Boundary Motion 3
The solution for grain boundary motion with translation is given by,
with
where
and
The comparison with the 1D numerical solution seems to be quite good. The numbers are , , , , and .
The code for this is source:trunk/boettinger/grainBoundary/gb.py@1561.
Grain Boundary Motion 2
In blog:GrainBoundaryMotion the calculation of the analytical solution was broken. The complete analytical solution including the power series is
where
and
The power series are given by,
with
and
with
The following image is generated using revision 1559 (source:trunk/boettinger/grainBoundary/gbStationary.py@1559) for , , , and .
Grain Boundary Motion
Bill has a simple 1D model for grain boundary motion and hillock formation. The governing equation is given by
where
with two jump conditions at .
We can include these conditions in the main equations if we write the equations in the weak form:
and
The fipy script is in source:trunk/boettinger/grainBoundary/gb.py.
Version 1551 (source:trunk/boettinger/grainBoundary/gb.py@1551) compares against an analytical solution given by:
where
where
The solution is good for small M in the vicinity of the groove and .
For values of , and with a box size of 40 and 200 grid points, the analytical and numerical solutions are in good agreement in the location of the groove at t=10.
Another test is with an analytical solution for . This looks like
where
and
The code for this comparison is source:trunk/boettinger/grainBoundary/gb.py@1553.
The image below is for at t=10 using the first 14 terms in the power series.
The analytical solution does a good job in the region of the groove, but doesn't rise above as it does when . Obviously, the power series is not much good if the argument is much above 1. If we use this as a boundary then we get at , which seems to correspond with the domain where the solutions are the same.
Notes on Light Propagation using the Eikonal
I want to try and figure out whether it is possible to solve the eikonal equation for a complex function of the wave fronts to simulate absorption of light. Here, we will recap assuming the wave front function is real. The wave equation for the electric field is given by,
where we will assume an electric field intensity of
where
If we assume a short wavelength then the wave equation reduces to
The time averaged Poynting vector is given by,
and the intensity is given by,
Electrical problem including metal diffusion
This post includes the effects of the counter electrode and the metal diffusion. The metal diffusion is included via a solution where the metal gradient doesn't change rapidly at the interface. This is not really very physical, but it will be useful when testing numerical solutions.
At the working electrode :
At the counter electrode :
where
The normals point out of the electrolyte.
In the bulk:
and
where and are the distances from the counter and working electrodes to the reference electrode, respectively. is the total distance between the electrodes.
The metal diffusion has an analytical solution
where
This solution is valid when . This leads to two dimensionless quantities that must be small,
and
We can now derive an expression for .
Using the above solution we can derive a new expression for .
If we do this the equations can be reduced to two coupled ODEs for and :
and
The material parameters are:
Integrating this gives:
The code for this is source:trunk/moffat/electrical/misc/1D_metal.py.
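The two coupled ODEs themselves did not survive the trac conversion, so the following is only a minimal sketch of integrating such a pair with scipy's odeint; the right-hand sides and initial values are placeholders, not the real expressions.

import numpy as np
from scipy.integrate import odeint

def rhs(y, t):
    # placeholder right-hand sides; the real expressions follow from the
    # boundary conditions and metal-diffusion solution described above
    eta, c = y
    d_eta = -0.1 * eta - 0.05 * c
    d_c = 0.02 * eta - 0.2 * c
    return [d_eta, d_c]

times = np.linspace(0.0, 100.0, 500)
eta0, c0 = 0.0, 1.0                      # hypothetical initial values
eta, c = odeint(rhs, [eta0, c0], times).T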
The evolution of the :
and the cupric concentrations at various times:
Symmetric Electrical Problem
Extending blog:MoffatElectrical1 by making the system totally symmetric
The changes are:
14c14
< alpha = 0.5
---
> alpha = 0.4
30c30
< return i0 * (numpy.exp(-alpha * Fbar * V) - numpy.exp((1 - alpha) * Fbar * V))
---
> return i0 * (numpy.exp(-alpha * Fbar * V) - numpy.exp((2 - alpha) * Fbar * V))
34c34
< - (1 - alpha) * Fbar * numpy.exp((1 - alpha) * Fbar * V))
---
> - (2 - alpha) * Fbar * numpy.exp((2 - alpha) * Fbar * V))
gives:
The code for this is source:trunk/moffat/electrical/misc/1D_symmetric.py.
The evolution of the :
Electrical Problem including counter electrode
Including the effects of the counter electrode.
At the working electrode :
At the counter electrode :
The normals point out of the electrolyte.
In the bulk:
and
where and are the distances from the counter and working electrodes to the reference electrode, respectively. is the total distance between the electrodes.
If we do this the equations can be reduced to two coupled ODEs for and :
and
The material parameters are:
Integrating this gives:
The code for this is source:trunk/moffat/electrical/misc/1D_counter.py.
The evolution of the :
A few changes to Bill's code
Bill wants me to make four changes to his code; see source:trunk/kirkendall/A-V_phase-field.py@1457. The four changes are:
- Update to be an expression involving . This is done in r1458.
- The second item is to check source:trunk/kirkendall/A-V_phase-field.py@1457#L71 against Bill's notes. It seems correct other than that he left out the diffusion term in his notes by mistake.
- Save the data from the simulations for later use, and read the saved files and plot the results independently of the simulations. See diff:@1458:1460. This now writes a data directory of the form data/A-V_phase-field_17_36_53Thu02Feb2012. The data files are stored as numpy zip files (.npz) and numbered by their time step value. plotdata.py reads and plots from the latest data directory created.
The name of an alternative path can be specified as an argument
$ python plotdata.py data/A-V_phase-field_17_33_42Thu02Feb2012
This will use the specified directory, while
$ python plotdata.py
will use the latest created data directory. plotdata.py plots the data for the four main fields using the stored data.
- Add an improved method for extracting the 0.5 level set. See r1461. This is done using scipy's interp1d. After plotting the four main fields for every stored timestep, plotdata.py will throw up a plot of the interface position (see the sketch below).
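As a rough illustration of the last two items, the sketch below loads one stored time step and pulls out the 0.5 level with scipy's interp1d. The file name inside the data directory and the array keys ('x' and 'phi') are hypothetical; only the directory naming convention comes from the notes above.

import numpy as np
from scipy.interpolate import interp1d

# hypothetical file name and keys; only the directory name is from the notes
data = np.load('data/A-V_phase-field_17_36_53Thu02Feb2012/000100.npz')
x, phi = data['x'], data['phi']

# invert phi(x) in a window where it crosses 0.5, sorting so interp1d is happy
mask = (phi > 0.05) & (phi < 0.95)
order = np.argsort(phi[mask])
position = float(interp1d(phi[mask][order], x[mask][order])(0.5))
print('interface position:', position)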
Light Propagation
Jon is interested in adding a better light calculation to his PV code. At the moment the Beer-Lambert law/equation is being used for light. Firstly we need to understand what is wrong with the current technique. We've been discussing the use of the level set method / fast marching method as an alternative to using FiPy's basic first order upwind scheme, which is quite inefficient and is not working for some problems. Some more thoughts:
- We don't have a second order accurate implicit scheme in FiPy.
- Would using a second order explicit scheme be very inefficient? It is a one-time cost since the problem is static, but it could get prohibitively bad for the sort of meshes Jon may need for accurate solutions.
- FMM will be way more efficient and we have second order schemes available, but not on unstructured meshes.
- Can we even solve the Beer-Lambert equation using the Eikonal equation?
- We need to understand what is going wrong with the current solution that Jon showed me in 1D and why it seems to deviate from the analytical. I believe that it is grid spacing.
Given the Beer-Lambert equation,
let's try and solve a 1D problem with a shock using implicit upwind. See trunk/misc/light.py. With a fixed grid things seem quite good, though this is only a 1D test; I'm not sure whether this carries over to 2D. Basically the velocity goes from 10 to 1 half way across the domain and we see a shock as expected ( is constant). The results are not so bad.
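Since the equation itself is missing above, the following is only a minimal FiPy sketch, assuming a steady absorption form d(uI)/dx = -alpha*I with unit intensity at the inlet and the velocity dropping from 10 to 1 half way across the domain; it is not the actual trunk/misc/light.py.

from fipy import (Grid1D, CellVariable, FaceVariable, UpwindConvectionTerm,
                  ImplicitSourceTerm, numerix)

L, nx = 20., 400
mesh = Grid1D(nx=nx, dx=L / nx)
X = mesh.faceCenters.value[0]

# velocity drops from 10 to 1 half way across the domain
velocity = FaceVariable(mesh=mesh, rank=1,
                        value=[numerix.where(X < L / 2., 10., 1.)])

I = CellVariable(mesh=mesh, name='intensity')
I.constrain(1., mesh.facesLeft)                 # unit intensity at the inlet

alpha = 1.0                                     # assumed absorption coefficient
eq = UpwindConvectionTerm(coeff=velocity) == ImplicitSourceTerm(coeff=-alpha)
eq.solve(var=I)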
The upwind implicit scheme can deal with shocks (at least in 1D). Let's observe how it deals with a large change in the mesh density (see trunk/misc/lightVariableMesh.py). Let's try a 100 fold change. A small jump occurs.
A ten fold increase. Seems fine.
A thousand fold increase. Big jump.
The jumps are not surprising and are consistent with a first order scheme. The tail has the right shape even though it has a large displacement. The PV 1D result did not have a tail that modeled the local solution correctly. Maybe that is due to a gradual change in mesh density rather than a single jump.
An example with a gradual change in mesh density, trunk/misc/lightVariableMeshGradual.py, shows this:
This could be the issue with the PV code.
Tom's Electrical Problem
Tom has come up with a new problem that involves solving the electric field. Here are the equations at the electrode interface ():
and in the bulk:
on the top electrode ():
At first glance, I didn't see that this system is closed. Since the equation is steady state and surrounded by Neumann boundary conditions, it only needs one boundary to fix the value (it is an initial value problem for the slope of rather than a boundary value problem for ).
Solving for the 1D problem
We have (assuming the normal points out of the electrolyte)
where is the distance between the electrodes
This leads to an ODE for
This can be solved with scipy's ODE integrator. To do this I took an example <http://docs.scipy.org/doc/scipy/reference/generated/scipy.integrate.ode.html> and used that. The code is source:trunk/moffat/electrical/1D.py.
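For reference, a stripped-down skeleton of that scipy.integrate.ode pattern; the right-hand side below is a placeholder since the actual ODE is not reproduced above.

from scipy.integrate import ode

def rhs(t, y):
    # placeholder; the real right-hand side follows from the interface conditions above
    return -0.5 * y

solver = ode(rhs).set_integrator('vode', method='bdf')
solver.set_initial_value(1.0, 0.0)

dt, tEnd = 0.1, 10.0
ts, ys = [0.0], [1.0]
while solver.successful() and solver.t < tEnd:
    solver.integrate(solver.t + dt)
    ts.append(solver.t)
    ys.append(solver.y[0])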
The results are
and
With the following numbers:
We get the following steady state values:
Installing Samrai
In order to use lsmlib in parallel, I need to install SAMRAI. Of course this is easier said than done. I downloaded Samrai 2.4.4 and did the following on bunter:
$ mkdir build
$ cd build
$ ../configure --prefix=/users/wd15/.virtualenvs/lsmlib --with-F77=/usr/bin/gfortran
checking build system type... x86_64-unknown-linux-gnu
...
config.status: executing default commands
$ make
make NDIM=1 compile || exit 1
...
g++ -g -Wall -fno-implicit-templates -I. -I../../../include -I../../../../include -I/usr/lib/openmpi/include -c ../../../../source/toolbox/restartdb/HDFDatabaseFactory.C -o HDFDatabaseFactory.o
../../../../source/toolbox/restartdb/HDFDatabaseFactory.C: In member function ‘virtual SAMRAI::tbox::Pointer<SAMRAI::tbox::Database> SAMRAI::tbox::HDFDatabaseFactory::allocate(const std::string&)’:
../../../../source/toolbox/restartdb/HDFDatabaseFactory.C:20: error: ‘HDFDatabase’ was not declared in this scope
../../../../source/toolbox/restartdb/HDFDatabaseFactory.C:20: error: template argument 1 is invalid
../../../../source/toolbox/restartdb/HDFDatabaseFactory.C:20: error: invalid type in declaration before ‘=’ token
../../../../source/toolbox/restartdb/HDFDatabaseFactory.C:20: error: expected type-specifier before ‘HDFDatabase’
../../../../source/toolbox/restartdb/HDFDatabaseFactory.C:20: error: invalid conversion from ‘int*’ to ‘int’
../../../../source/toolbox/restartdb/HDFDatabaseFactory.C:20: error: expected ‘,’ or ‘;’ before ‘HDFDatabase’
../../../../source/toolbox/restartdb/HDFDatabaseFactory.C:21: error: invalid conversion from ‘int’ to ‘SAMRAI::tbox::Database*’
../../../../source/toolbox/restartdb/HDFDatabaseFactory.C:21: error: initializing argument 1 of ‘SAMRAI::tbox::Pointer<TYPE>::Pointer(TYPE*, bool) [with TYPE = SAMRAI::tbox::Database]’
make[4]: *** [HDFDatabaseFactory.o] Error 1
make[4]: Leaving directory `/users/wd15/packages/SAMRAI/build/source/toolbox/restartdb'
make[3]: *** [libXd] Error 1
make[3]: Leaving directory `/users/wd15/packages/SAMRAI/build/source/toolbox'
make[2]: *** [libXd] Error 1
make[2]: Leaving directory `/users/wd15/packages/SAMRAI/build/source'
make[1]: *** [compile] Error 1
make[1]: Leaving directory `/users/wd15/packages/SAMRAI/build'
make: *** [library] Error 1
So that didn't work. I had the following exchange with someone on the SAMRAI list. I wrote
Hi, I'm trying to build SAMRAI version 2.4.4. The configure step seems to have no issues, but the compilation step throws the following error.

$ make
make NDIM=1 compile || exit 1
make[1]: Entering directory `/users/wd15/Documents/python/SAMRAI'
(cd source && make libXd) || exit 1
... ... ...
g++ -g -Wall -fno-implicit-templates -I. -I../../../include -I../../../include -I/usr/lib/openmpi/include -c HDFDatabaseFactory.C -o HDFDatabaseFactory.o
HDFDatabaseFactory.C: In member function ‘virtual SAMRAI::tbox::Pointer<SAMRAI::tbox::Database> SAMRAI::tbox::HDFDatabaseFactory::allocate(const std::string&)’:
HDFDatabaseFactory.C:20: error: ‘HDFDatabase’ was not declared in this scope
HDFDatabaseFactory.C:20: error: template argument 1 is invalid
HDFDatabaseFactory.C:20: error: invalid type in declaration before ‘=’ token
HDFDatabaseFactory.C:20: error: expected type-specifier before ‘HDFDatabase’
HDFDatabaseFactory.C:20: error: invalid conversion from ‘int*’ to ‘int’
HDFDatabaseFactory.C:20: error: expected ‘,’ or ‘;’ before ‘HDFDatabase’
HDFDatabaseFactory.C:21: error: invalid conversion from ‘int’ to ‘SAMRAI::tbox::Database*’
HDFDatabaseFactory.C:21: error: initializing argument 1 of ‘SAMRAI::tbox::Pointer<TYPE>::Pointer(TYPE*, bool) [with TYPE = SAMRAI::tbox::Database]’
make[4]: *** [HDFDatabaseFactory.o] Error 1
... ... ...
make[1]: *** [compile] Error 1
make[1]: Leaving directory `/users/wd15/Documents/python/SAMRAI'
make: *** [library] Error 1

Any ideas what the issue may be? Other details:

$ uname -a
Linux bunter 2.6.32-5-amd64 #1 SMP Mon Oct 3 03:59:20 UTC 2011 x86_64 GNU/Linux
$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description: Debian GNU/Linux 6.0.3 (squeeze)
Release: 6.0.3
Codename: squeeze

All other packages are at standard versions for Debian Squeeze including the HDF5 libraries. Thanks
and I got the reply
Daniel, HDFDatabase.h is guarded with a #ifdef HAVE_HDF5. From the error message it appears that HAVE_HDF5 is not defined. Grep for HAVE_HDF5 in SAMRAI_config.h to see if it is defined. I suspect that something went wrong during configuration and you are not getting HDF5. Another clue that this is likely the problem is that there is no hdf5 include directory on the compile line. Let us know what you find. -- Bill Arrighi Lawrence Livermore National Lab
so I tried a new configure step to include hdf5
$ ../configure --prefix=/users/wd15/.virtualenvs/lsmlib --with-F77=/usr/bin/gfortran --with-hdf5
checking build system type... x86_64-unknown-linux-gnu
...
checking for HDF5 installation... /usr
checking whether HDF5 link works... no
configure: error: HDF5 compile/link test failed
Andrew suggested debugging the configure script with the -x flag. The "set -x" was placed on line 19422. Anyway, it showed that the configure script is trying to build and link the following C++ file (conftest.cpp):
#include "hdf5.h" #define FILE "file.h5" int main() { hid_t file_id; /* file identifier */ herr_t status; /* Create a new file using default properties. */ file_id = H5Fcreate(FILE, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT); /* Terminate access to the file. */ status = H5Fclose(file_id); }
with
$ g++ -o conftest -g -O2 conftest.cpp -lz -ldl -lm -lz -lm
/tmp/ccn31txn.o: In function `main':
/users/wd15/packages/SAMRAI/build/conftest.cpp:10: undefined reference to `H5check_version'
/users/wd15/packages/SAMRAI/build/conftest.cpp:10: undefined reference to `H5Fcreate'
/users/wd15/packages/SAMRAI/build/conftest.cpp:13: undefined reference to `H5Fclose'
collect2: ld returned 1 exit status
if I do
$ g++ -o conftest -g -O2 conftest.cpp -lz -ldl -lm -lz -lm -lhdf5
$ ./conftest
it works fine. Looking at the configure code, it looks like it wants a directory for --with-hdf5:
../configure --prefix=/users/wd15/.virtualenvs/lsmlib --with-F77=/usr/bin/gfortran --with-hdf5=/usr
gets through the config without issue. However, make fails with loads of errors:
g++ -g -Wall -fno-implicit-templates -I. -I../../../include -I../../../../include -I/usr/include -I/usr/lib/openmpi/include -c ../../../../source/toolbox/restartdb/HDFDatabase.C -o HDFDatabase.o
/usr/include/H5Gpublic.h: In static member function ‘static herr_t SAMRAI::tbox::HDFDatabase::iterateKeys(hid_t, const char*, void*)’:
/usr/include/H5Gpublic.h:148: error: too many arguments to function ‘hid_t H5Gopen1(hid_t, const char*)’
../../../../source/toolbox/restartdb/HDFDatabase.C:160: error: at this point in file
/usr/include/H5Gpublic.h:148: error: too many arguments to function ‘hid_t H5Gopen1(hid_t, const char*)’
../../../../source/toolbox/restartdb/HDFDatabase.C:180: error: at this point in file
../../../../source/toolbox/restartdb/HDFDatabase.C:204: error: invalid conversion from ‘herr_t (**)(hid_t, void*)’ to ‘void**’
/usr/include/H5Epublic.h:212: error: too many arguments to function ‘herr_t H5Eget_auto1(herr_t (**)(void*), void**)’
../../../../source/toolbox/restartdb/HDFDatabase.C:204: error: at this point in file
/usr/include/H5Epublic.h:216: error: too many arguments to function ‘herr_t H5Eset_auto1(herr_t (*)(void*), void*)’
...
make[4]: *** [HDFDatabase.o] Error 1
make[4]: Leaving directory `/users/wd15/packages/SAMRAI/build/source/toolbox/restartdb'
make[3]: *** [libXd] Error 1
make[3]: Leaving directory `/users/wd15/packages/SAMRAI/build/source/toolbox'
make[2]: *** [libXd] Error 1
make[2]: Leaving directory `/users/wd15/packages/SAMRAI/build/source'
make[1]: *** [compile] Error 1
make[1]: Leaving directory `/users/wd15/packages/SAMRAI/build'
make: *** [library] Error 1
python setup.py develop issues
If python setup.py develop does not seem to be updating easy-install.pth correctly, then try running pip install setuptools --upgrade and rerunning python setup.py develop.
Installing Ubuntu on the Acer Aspire One Netbook
I purchased the Aspire One at Costco for $300 with the hope that Ubuntu would install smoothly without glitches. Unsurprisingly, it wasn't a straightforward experience. The first problem was that the monitor aspect ratio was wrong: all text was squashed vertically. Luckily, the first 11.04 update fixed this problem. It also prompted me to install two proprietary drivers, which I did. Before the drivers were installed I couldn't load Unity; after they were installed I could.
At first this all seemed fine apart from a weird AMD watermark in the bottom right hand corner of the screen. I did the googles and found the following, which fixed the issue,
Totally gross, but it works.
This all seemed totally fine, but after a while the computer just started to freeze and needed to be rebooted. This turned out to be harder to fix in terms of finding answers. I believe that it is somehow connected to the wireless as it always seems to happen when the wireless is starting up, but who knows. Anyway, I found this helpful post:
<http://ubuntuforums.org/showpost.php?p=10571078&postcount=16>
This worked (thank god). I had to boot into safe mode in order to have enough time to edit the necessary file. After this things all seem to be fine other than Unity not working. I'm not too bothered about that because I'm perfectly happy with classic mode. In fact classic seems a bit snappier. I'm not really sure about Unity anyway (what does it really add?).
Overall, compared with my now defunct Dell mini, it's a lot faster. Firefox works properly, which it didn't on the mini. I'm a happy bunny once more for $300 and some head scratching.
Generating Passwords
pwgen -Bcny 12 1
Making lam work across multiple machines
To make lam work across multiple machines take the following steps:
- Create a bhost file
bunter.nist.gov cpu=4
hobson.nist.gov cpu=4
alice.nist.gov cpu=4
- $ export LAMRSH=ssh
- $ lamboot -ssi boot_rsh_ignore_stderr 1 bhost
- $ mpirun -np 12 python script.py
This should then work transparently across the three machines. The stdout streams from the other machines didn't appear on the screen, but they were certainly running in parallel.
mpi4py, trilinos, lam and i686
We needed both trilinos and mpi4py working against lam on the i686 machines so that we can test effectively and run parallel jobs unopposed. I built mpi4py on bunter by
- making sure absolutely everything with mpi4py in it was removed from site-packages
- Adding
[i686-lam]
mpi_dir = /usr/lib/lam
mpicc = /usr/bin/mpicc.lam
mpicxx = /usr/bin/mpic++.lam
to mpi.cfg and running python setup.py build --mpi=i686-lam in the i686-lam virtualenv.
I built trilinos using i686-lam with
../configure CXX=mpic++.lam CC=mpicc.lam F77=mpif77.lam CXXFLAGS=-O3 CFLAGS=-O3 FFLAGS="-O5 -funroll-all-loops -malign-double" --enable-epetra --enable-aztecoo --enable-pytrilinos --enable-ml --enable-ifpack --enable-amesos --with-gnumake --enable-galeri --enable-mpi --with-mpi=/usr/lib/lam --cache-file=config.cache --prefix=/users/wd15/.virtualenvs/i686-lam
Problem in virtualenvwrapper
I've had some issues with virtualenvwrapper. Whenever I start up a new shell, the shell hangs when virtualenvwrapper is installed. I debugged this by putting a "set -x" at the top of "virtualenvwrapper.sh". "virtualenvwrapper.sh" is sourced from ".bashrc". Anyway, the debug information pointed the finger at line 193 of "virtualenvwrapper.sh"
fi trap "rm '$file' >/dev/null 2>&1" EXIT echo $file
It hangs on the "rm" command as my "rm" command is aliased to "rm -i". I am not sure what the right thing to is. For now I have just set
fi trap "\rm '$file' >/dev/null 2>&1" EXIT echo $file
which deals with the issue. Is the problem with "virtualenvwrapper.sh", because it assumes that "rm" is the unaliased default, or should my aliases come last in the .bashrc file, so that the default commands are in effect while the "rc" type files execute?
Bill's changes to paper April 6 2010
Stuff that I am not sure about:
- Think hard about the entropy stuff in the abstract
- comment about what people believe regarding the viscosity of metals
Convergence
There are three simulations that need to be compared to test convergence: 267 (250x360), 271 (500x700) and 272 (125x180). Simulation 272 may not work; I can't remember whether this is an adequate size. I may need to do an even larger simulation, but will need access to more CPUs for that.
Entropy Production
Given the free energy per unit volume, the entropy production for our problem should be,
where the and are summed. I think the entropy production for the last two terms is correct. Right now, I'm confused by the first term.
Simulations with van Leer convection term
The upwind convection term appeared to help the low concentration simulation work. However, the upwind convection term is inaccurate. Let's use the van Leer convection term instead.
ID | Notes | Issues | Status | movie | |
267 | base | none | benson, hobson, alice (stopped) | 12 | |
268 | none | renfield, sancho, kato (stopped) | 12 | ||
269 | crashes | luggage (stopped) | 12 | ||
270 | none | luggage (stopped) | 12 |
Simulations with a negligible radial displacement
The radial displacement issue has been troubling me. Most of the simulations have had a radial displacement of dr / 10 and some others have dr. Some simulations don't run at all and an instability can clearly be seen to emanate from the r=0 location. I tested the base simulation with dr * 1e-10 just to figure out why I had trouble in the past. Needless to say, the simulation seems to be running fine. I have just launched some more simulations with dr * 1e-10 to try and diagnose the issue.
ID | Notes | Issues | Status | movie | |
264 | base | None | benson, hobson, alice (stopped) | 12 | |
265 | worked with upwind convection term | renfield, sancho, kato (stopped) | 12 | ||
266 | crashed | luggage (stopped) | 12 |
Useful Simulations
ID | Notes | diff | Issues | Status | movie | |
222 | base | diff | benson, hobson, alice (stopped) | 12 | ||
225 | diff | luggage (stopped) | 12 | |||
226 | diff | crashed | renfield, sancho, kato (stopped) | 12 | ||
233 | 222: | diff | cluster 7.0 (stopped) | 4 | ||
240 | 222: | diff | crashed | luggage (stopped) | 12 | |
252 | 243: , shift=dx, | diff | cluster 128 (stopped) | 4 | ||
257 | 251: , , shift=2*dx, | diff | luggage (stopped) | 12 |
Edwin Schwalbach Model
For the last three days Eddie and I have been working on his VLS growth model. We've been trying to implement it in FiPy. We have successfully implemented a two phase, pure, 1D system. His notes are here
The code is here
Essentially, this is simply the van der Waals model with a free energy of the form
We implement the model using the block matrix approach with , and as the base variables to "solve for". The is linearized with,
where,
Remember,
and
and
Explicitly,
and
Source Terms
Take an equation, say,
In general can be implemented explicitly (i.e. where is the previous sweep value). Now, to implement this implicitly, we need to linearize,
or
where
In general for stability reasons, though one can often relax this somewhat if the source is not dominant. If the source is not negative, then should be evaluated entirely explicitly. The reason must be negative is the following. Take,
then the discretized implicit form is
Say is positive, then this equation should increase the value of , but if then we get a negative value of . If is negative, there is no stability criterion. Conversely, if then the explicit scheme,
will drive the solution below for sufficiently negative . For a very gentle intro,
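A minimal FiPy sketch of that split, with made-up coefficients: the part of the source with a negative coefficient goes in through an ImplicitSourceTerm and the remainder is added explicitly.

from fipy import Grid1D, CellVariable, TransientTerm, DiffusionTerm, ImplicitSourceTerm

mesh = Grid1D(nx=50, dx=1.)
phi = CellVariable(mesh=mesh, value=1.)

# hypothetical source S(phi) = S0 - k * phi, linearized about the old value:
# the constant part S0 stays explicit; the -k * phi part is implicit since -k < 0
S0, k = 1.0, 0.5
eq = TransientTerm() == DiffusionTerm(coeff=1.) + S0 - ImplicitSourceTerm(coeff=k)

for step in range(10):
    eq.solve(var=phi, dt=0.1)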
Pressure velocity coupling
We use the Rhie-Chow interpolation method to deal with this issue since and are not on staggered grids but collocated. Essentially, the Rhie-Chow interpolation scheme corrects the face velocities in the continuity equation by the use of a corrected velocity that approximates solving the momentum equation on the faces. Let's take the discretized form of the x-momentum equation:
where is a cell value. For small enough grid, we would like
that is to say that we would like the residual of the momentum equation evaluated at the face to be equal to the average of the residuals over the neighboring cells. We assume that the terms are unimportant and say that,
We approximate and then we get,
The continuity equation can be written,
Substituting in gives a diffusion term for , which acts to stabilize the solution. See source:trunk/vls/input.py@1031#L78.
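A rough, runnable sketch of that corrected face velocity on a 1D grid (not the code in source:trunk/vls/input.py); the pressure and velocity fields are made up, and ap stands in for the diagonal coefficient of the discretized momentum equation.

from fipy import Grid1D, CellVariable

mesh = Grid1D(nx=20, dx=1.)
x = mesh.cellCenters.value[0]

pressure = CellVariable(mesh=mesh, value=x**2)    # made-up pressure field
xVelocity = CellVariable(mesh=mesh, value=1.)     # made-up cell-centered velocity
ap = CellVariable(mesh=mesh, value=2.)            # momentum matrix diagonal (placeholder)
volume = CellVariable(mesh=mesh, value=mesh.cellVolumes)

# interpolate the cell velocity to the faces, then correct it with the difference
# between the interpolated cell pressure gradient and the true face gradient
coeff = (volume / ap).arithmeticFaceValue
faceVelocity = (xVelocity.arithmeticFaceValue
                + coeff * (pressure.grad.arithmeticFaceValue[0] - pressure.faceGrad[0]))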
Parasitic Currents
Discretize using rather than . This induces dissipation associated with momentum. See source:trunk/reactiveWetting/paper/jacqmin.pdf@1031.
Movie of solution to equations
Hydrodynamic limit with elevated interface diffusion
Increasing certainly makes a big difference.
The image shows the drop radius against time. The inertial time scale scales time, and the initial radius of the drop scales the spreading distance . There is an order of magnitude change in time between when the spreading occurs for the old hydrodynamic case and the new ones. I will need to plot the spreading rate curves to see how this knocks the solution away from Spelt. Also, there seems to be some difference between and . Note that . The interface diffusion time scale is so we probably need to raise the interface diffusion coefficient by two orders from the fluid value to get it to be less than the inertial time scale. If indeed that does matter. It certainly seems to.
Interface Diffusion
The following figure shows melting rates for various with the sharp interpolation scheme.
Firstly, the melting rates agree well with analytical rates. Secondly, as pointed out in the previous post, does not have much influence on the melting. It has some influence at early times, which is evidently related to hydrodynamics. By about , the curves for have almost all collapsed onto each other.
Interpolation Schemes for M
Carrying on from blog:SolidDiffusion1 it is clear that we have to modify the interpolation scheme. It showed that the time scale in the interface has to be fast enough to not influence the bulk diffusion properties. It is clear that, for the purposes of diffusion, having the interface diffusion coefficient equal to the bulk diffusion coefficient is adequate. However, it may be necessary for hydrodynamic reasons to have faster diffusion in the interface.
In this figure refers to
and refers to
where
This red curve demonstrates that allowing the interface to have the fluid diffusion coefficient all the way through gets us back to the bulk limited value. It also demonstrates that the -based interpolation scheme doesn't really work because it overly influences the diffusion in the bulk and allows it to have much faster diffusion than it otherwise should. This is due to the extent that extends out beyond the interface region. As shall be seen, a sharp cut off at ameliorates this issue.
As shall be seen in the subsequent post, the black line actually agrees with the analytical solution and is bulk limited. It is also clear that as goes from 2 to 1 the melting is faster due to this issue.
Analytic Melting Solution
The following is an analytical melting solution from Bill and Geoff
where is the distance the interface has moved. is given by solving
and is given by,
In this case .
Influence of Solid Diffusion
The 1D simulations have demonstrated that the solid diffusion coefficient has a major impact on the rate of melting. The following image shows melting with being varied.
It is clear that has a considerable impact on the melting. This is due to the limiting nature of the interface diffusion. The interpolation scheme is
which drives the value of the interface diffusion down a lot in the solid. The system is bulk limited as approaches , as can be seen from the black dashed line. We thought that the system was limited by the phase field motion, but in fact it obviously isn't. The time scale for the phase field equation is given by,
which is faster than the bulk diffusion time scale even when ,
where is the distance moved, which I have chosen such that , a reasonable choice in this case. It is hardly surprising that this system is bulk limited.
So why does the solid diffusion eventually slow things down? Maybe we could say that the interface has to have reached some sort of equilibration before the motion occurs. This would lead to a time scale on the order of,
This is probably too fast; it probably also has a prefactor like the in the bulk. However, it should be faster than the bulk kinetics. This seems obvious now for some reason. Anyway, it makes sense that we need to reduce the solid diffusion by three orders of magnitude before the interface kinetics start to become slower than the bulk kinetics.
Time scales associated with diffusion
Having done a large number of 1D simulations it has become apparent that the comments in the blog:SolidDiffusion entry hit the nail on the head and are probably correct. The following image shows the liquid fraction from a number of 1D simulations that start with a system that has 1/3 of each phase, but with a low concentration in the liquid. The liquid should expand to 1.12 of its original volume.
Clearly, the solid diffusion coefficient has no effect on the melting as long as it is above a given threshold. This agrees with our earlier time scale analysis in the previous blog entry. In the following image the interpolation scheme is varied. It is clear that if the fluid diffusion is used in the interface the solid diffusion coefficient does not affect the melting.
Given the sharp interpolation scheme, is it still possible to have interface effects? Well, maybe, if we could be in for some problems. To avoid this:
For the numbers that we are using, the expression on the right hand side is approximately 0.01. That means we are in good shape for the 2D simulations from the perspective of not being interface limited for melting.
However, the absolute value for is , which is a lot slower than many of the other time scales in the system. The inertial time scale for example is two orders of magnitude faster. This means that we are in extremely bad shape since interface effects are still way slower than the spreading. Ideally for a physical system, interface effects should be much faster than the spreading. This would indicate that we could be interface limited for spreading, which is bad.
Melting dependence on solid diffusion coefficient
It is becoming apparent that the rate of melting is not independent of the solid diffusion coefficient. The following link is to a movie where the system has the same viscosity and diffusion coefficient (2E-3 and 2E-2) in all phases.
The dynamics is quite fast and the pressure and chemical potential equilibration occur at similar time scales. There is a nice concentration gradient and the phase field relaxes back as the liquid melts. Note how the chemical potentials equilibrate nicely through the interface and large gradients are quickly removed.
The following is a simulation where the solid diffusion coefficient is 2E-11 (down from 2E-2).
From a pure diffusion argument the solid diffusion coefficient should have no influence on the melting rate of the liquid. Clearly in this example it is having an effect. If you look at frames from similar times in both the above simulations, the rate of melting is clearly different and the concentration profile in the liquid has no gradients in the second movie. Also, note that the chemical potential has not equilibrated through the solid liquid interface. This is very interesting.
This does seem somewhat disturbing. However, if we look at the diffusion equation, there are two time scales, the first being the diffusion time scale through the liquid,
where is the length of the liquid phase (it can't be shorter than this because the diffusion equation creates a profile across the entire liquid domain!). The second time scale is an interface time scale associated with the fourth order term,
where is the depth of the interface. is very fast, about 9.27E-9, equivalent to the other time scales in the system such as the phase field and inertial flow scales. Since the chemical potential needs to equilibrate through the entire interface I am going to postulate that I can use \bar{M} in the solid to evaluate , giving a time scale of 2.35E-4. This is very slow. This represents a worst case since in the interface is interpolated between the liquid and solid, but there are still regions of the interface that require a time scale close to this for equilibration.
This argument suggests that needs to be large enough or needs to be small enough such that and only at this point is melting independent of . This cross over point for our numbers is roughly when is three or four orders of magnitude less than .
The other thing to note is that for a physical system is much smaller so for a realistic system in general.
Another hand wavy argument may be to say that represents the time scale for the initial formation of the interface and if this is really slow all bets are off.
I am beginning to realize why it is important to have a more realistic interface depth. The above argument suggests the diffusion coefficient should be large enough such that the time scale associated with the fourth order term in the diffusion equation is always faster than other relevant time scales. All the spreading dynamics will be screwed up otherwise.
MPI command
mpirun -np 12 -machinefile machinefile ./input.py >> out 2>&1 &
Parallel on the Cluster
Results for parallel simulations on the cluster
- 1 processor, r005
- sweep times: 29.3520491123 23.3660550117 23.3256959915 23.1911690235 23.1299791336 22.9121041298 22.8525710106 23.0343871117 22.8907659054 23.039219141 23.1625990868 23.013864994
- 2 processors, r006 r005
- sweep times: 15.5924270153 12.1713759899 12.2448968887 12.2202451229 12.2246148586 12.2457139492 12.3183469772 12.3248221874 12.215831995 12.2335240841 12.2320721149
- 4 processors, r006 r007 r006 r005
- sweep times: 21.6786100864 6.6802740097 6.63790607452 6.84371995926 6.65304708481 7.14012789726 6.46965503693 6.99537992477 6.83790421486 6.62651705742 6.71980905533 6.77136397362 6.41707897186
- 8 processors, r007 r008 r006 r007 r006 r008 r007 r005
- sweep times: 17.380297184 17.7295508385 17.6523389816 17.4138000011 31.6395330429 17.4052929878 24.2372221947 31.4160881042 3.41191792488 3.63104009628 3.52340602875 10.1916849613 17.3036520481 10.5115458965 3.95448207855 3.49768590927
- 16 processors, r007 r012 r008 r006 r007 r006 r022 r013 r012 r012 r008 r007 r011 r020 r010 r005
- sweep times: 30.4704580307 15.7536921501 16.4713590145 17.762444973 3.03759098053 3.31603288651 15.6013848782 9.57383298874 43.7733981609 3.11927390099 3.11927390099 15.9374248981 2.89413809776 2.70106816292 2.70635294914 16.7859950066
Results demonstrate:
- good scaling from 1 to 4 processors
- something weird happening at 8 processors
- the fastest times for 8 processors do scale correctly; the spread might indicate network traffic
- similarly, for 16 processors there is a 2.7s sweep
Melting Simulations
- 125 is the base simulation
- 126 is with 50% reduction of .
- 127 is with increased by an order of magnitude
- Not a great deal of variation; will take higher.
- 126 has the slowest spreading. Makes sense, since a reduction in concentration reduces the spreading rate (slightly more melting maybe).
- 127 is faster as is increased
Base Simulation with Higher Viscosity
- 119 is the original base simulation with
- 125 has a higher solid viscosity and a deeper solid region
- 128 is the same as 125, but with a tolerance of instead of .
- The figure demonstrates that 128 and 125 map onto each other fairly well, indicating that is adequate
- 119 has slower spreading dynamics than 125, which is counter-intuitive; this may have something to do with the deeper solid region
- It is clear from the movies for source:trunk/reactiveWetting/119/movie/119.gif and source:trunk/reactiveWetting/125/movie/125.gif that the solid waves are substantially decreased with the higher solid viscosity.
Dissolution Simulations
The simulations in the table below are being run as I write up the paper; most are very close to production quality.
ID | Notes | diff | Issues | Status | movie | |
119 | hydrodynamic limit | solid waves, wrong value of X1 close to equilibrium | stopped | 32 | movie | |
120 | removed pytables dependencies to run on rareton | diff | stopped | |||
121 | finer grid density | diff | better initial conditions required | stopped | 32 | movie |
122 | and deeper solid | diff | will induce solidification | stopped | 6 | movie |
123 | diff | drop wets top surface | stopped | 11 | ||
124 | and deeper vapor | diff | will induce solidification | stopped | 11 | movie |
125 | , hydrodynamic limit | diff | less waves, but better initial conditions required | stopped | 12 | |
126 | diff | no melting | stopped | 6 | ||
127 | diff | increasing mobility will not induce melting | stopped | 12 | ||
128 | diff | better initial conditions required | stopped | 12 | ||
129 | and | diff | stopped | 12 | ||
130 | diff | increasing mobility will not induce melting | stopped | 4 | ||
131 | diff | increasing mobility will not induce melting | stopped | 4 | ||
132 | diff | X1 has weird profile due to smoothing | stopped | 12 | ||
133 | smoothing and | diff | worked, but fast diffusion not required | stopped | 4 |
134 | 1D to see eventual melting | diff | not required smaller nx works | stopped | 4 | |
135 | diff | diffusion fast enough, large dissolution impossible | stopped | 4 |
136 | , 1D, nx=100 | diff | not required smaller nx works | stopped | 4 | |
137 | 1D, nx=50, | diff | wrong initial conditions | stopped | 4 | |
138 | hydrodynamic limit | diff | stopped | 12 | ||
139 | 1D, nx=10, , shift=0.0 | diff | nx < 29 doesn't seem to work or different shifts | stopped | 4 | |
140 | diff | stopped | 4 | |||
141 | diff | stopped | 12 | |||
142 | nx=2 | diff | calculation broke down with nx =2 | stopped | 4 | |
143 | same as 137 but corrected initial conditions | diff | stopped | 4 | ||
144 | same as 141 but | diff | stopped | 4 | ||
145 | 143: | diff | stopped | 4 | ||
146 | 63: with modified unsegregatedSolver from 145 and NC1, NC2 = 0 between sweeps | diff | stopped | 1 | movie |
147 | 146: | diff | stopped | 1 | movie | |
148 | 147: | diff | stopped | 1 | movie | |
149 | 147: | diff | stopped | 1 | movie | |
150 | 148: | diff | stopped | 1 | movie | |
151 | 150: | diff | stopped | 1 | ||
152 | 150: | diff | stopped | 1 | ||
153 | 150: | diff | stopped | 1 | ||
154 | 150: | diff | stopped | 1 | ||
155 | 150: | diff | stopped | 1 | ||
156 | 150: | diff | stopped | 1 | ||
157 | 150: when | diff | stopped | 1 | ||
158 | 157: | diff | stopped | 1 | ||
159 | 154: half interface depth and N=1600 | diff | stopped | 1 | ||
160 | 154: | diff | stopped | 1 | ||
161 | 159: | diff | stopped | 1 | ||
162 | 149: | diff | stopped | 1 | ||
163 | 154: | diff | stopped | 1 | ||
164 | 154: | diff | stopped | 1 | ||
165 | 154: | diff | stopped | 1 | ||
166 | 150: | diff | stopped | 1 | ||
167 | 166: | diff | stopped | 1 | ||
168 | 167: box interpolation | diff | stopped | 1 | ||
169 | 167: , | diff | stopped | 1 | ||
170 | 167: , | diff | stopped | 1 | ||
171 | 170: , | diff | stopped | 1 | ||
172 | 170: , | diff | stopped | 1 | ||
173 | 167: , | diff | stopped | 1 | ||
174 | 173: , | diff | stopped | 1 | ||
175 | 174: , | diff | stopped | 1 | ||
176 | 175: , sharp 0.95 and 0.05 | diff | stopped | 1 | ||
177 | 176: | diff | stopped | 1 | ||
178 | 177: | diff | stopped | 1 | ||
179 | 176: | diff | stopped | 1 | ||
180 | 179: | diff | stopped | 1 | ||
181 | 179: | diff | stopped | 1 | ||
182 | 179: | diff | stopped | 1 | ||
184 | 138: 2D, | diff | stopped | 12 | ||
185 | 184: 2D, | diff | stopped | 12 | ||
186 | 184: 2D, | diff | stopped | 4 | ||
187 | 184: 2D, | diff | stopped | 12 | ||
188 | 186: 2D, | diff | stopped | 12 | ||
189 | 186: 2D, | diff | calculations broke down | stopped | 4 | |
190 | 186: 2D, | diff | calculations broke down | stopped | 4 | |
191 | 189: 2D, | diff | stopped | 4 | ||
192 | 189: 2D, | diff | stopped | 4 | ||
193 | 189: 2D, | diff | stopped | 4 | ||
194 | 189: 2D, | diff | stopped | 4 | ||
195 | 186: 2D, , | diff | stopped | 12 | ||
196 | 195: 2D, | diff | stopped | 12 | ||
197 | 196: 2D, factor=2 | diff | stopped | 32 | ||
198 | 185: 2D, | diff | stopped | 12 | ||
199 | 185: 2D, | diff | stopped | 12 | ||
200 | 185: 2D, | diff | stopped | 12 | ||
201 | 185: 2D, | diff | stopped | 12 | ||
202 | 199: 2D, | diff | stopped | 4 | ||
203 | 199: 2D, | diff | stopped | 4 | ||
204 | 198: 2D, | diff | stopped | 12 | ||
205 | 198: 2D, | diff | crashed | benson, hobson, alice (stopped) | 12 | |
206 | 198: 2D, | diff | crashed | stopped | 12 | |
207 | 198: 2D, | diff | crashed | stopped | 4 | |
208 | 198: 2D, | diff | stopped | 4 | ||
209 | 198: 2D, | diff | crashed | cluster 411 (stopped) | 4 | |
210 | 198: 2D, | diff | cluster 412 (stopped) | 4 | ||
211 | 198: 2D, | diff | cluster 413 (stopped) | 4 | ||
212 | 198: 2D, | diff | crashed | renfield, sancho, kato (stopped) | 12 | |
213 | 212: 2D, 5000 iterations | diff | crashed | cluster 418 (stopped) | 4 | |
214 | 205: 2D, | diff | crashed | poole (stopped) | 4 | |
215 | 212: 2D, | diff | crashed | luggage (stopped) | 12 | |
216 | 205: 2D, | diff | crashed | benson, hobson, alice (stopped) | 12 | |
217 | 212: 2D, | diff | crashed | renfield, sancho, kato (stopped) | 12 | |
218 | 198: | diff | crashed | luggage (stopped) | 12 | |
219 | 218: | diff | cluster 424 (stopped) | 4 | ||
220 | 218: | diff | cluster 425 (stopped) | 4 | ||
221 | 218: | diff | cluster 426 (stopped) | 4 |
New base simulation
ID | Notes | diff | Issues | Status | movie | |
222 | base | diff | benson, hobson, alice | 12 | ||
223 | diff | None | luggage (stopped) | 12 | ||
224 | diff | crashed | luggage (stopped) | 12 | ||
225 | diff | luggage | 12 | |||
226 | diff | crashed | renfield, sancho, kato (stopped) | 12 | ||
227 | 224: , , | diff | crashed | luggage (stopped) | 12 | |
228 | 225: , , | diff | crashed | renfield, sancho, kato (stopped) | 12 | |
229 | 226: , , | diff | crashed | cluster (stopped) | 4 | |
230 | 222: | diff | crashed | cluster 12.0 (stopped) | 4 | |
231 | 222: | diff | crashed | oddjob (stopped) | 4 | |
232 | 222: | diff | crashed | oddjob (stopped) | 4 | |
233 | 222: | diff | cluster 7.0 | 4 | ||
234 | 225: | diff | crashed | oddjob (stopped) | 3 | |
235 | 225: | diff | None | cluster 2.0 (stopped) | 4 | |
236 | 225: | diff | crashed | cluster 31.0 (stopped) | 4 | |
237 | 224: | diff | crashed | renfield, sancho, kato (stopped) | 12 | |
238 | 224: | diff | crashed | cluster 3.0 (stopped) | 4 | |
239 | 230: | diff | crashed | cluster 4.0 (stopped) | 4 | |
240 | 222: | diff | crashed | luggage (stopped) | 12 | |
241 | 223: | diff | crashed | cluster 5.0 (stopped) | 4 | |
242 | 235: | diff | crashed | cluster 6.0 (stopped) | 4 | |
243 | 226: , shift=dx | diff | None | renfield, sancho, kato (stopped) | 12 | |
244 | 226: , shift=10 * dx | diff | None | luggage (stopped) | 12 | |
245 | 226: , shift=100 * dx | diff | None | cluster 28 (stopped) | 4 | |
246 | 226: | diff | None | cluster 29 (stopped) | 1 | |
247 | 225: , shift=dx | diff | crashed | luggage (stopped) | 12 | |
248 | 224: , shift=dx | diff | crashed | cluster 30 (stopped) | 4 | |
249 | 247: , shift=2 * dx | diff | crashed | renfield, sancho, kato (stopped) | 12 | |
250 | 247: , , shift=dx | diff | crashed | luggage (stopped) | 12 | |
251 | 240: , , shift=dx | diff | crashed | cluster 31 (stopped) | 4 | |
252 | 243: , shift=dx, | diff | cluster 32 | 4 | ||
253 | 243: , shift=dx | diff | crashed | cluster 33 (stopped) | 4 | |
254 | 253: , shift=dx, | diff | crashed | renfield, sancho, kato (stopped) | 12 | |
255 | 254: , shift=dx, | diff | crashed | renfield, sancho, kato (stopped) | 12 | |
256 | 255: , shift=dx, | diff | crashed | cluster 34 (stopped) | 4 | |
257 | 251: , , shift=2*dx, | diff | renfield, sancho, kato | 12 | ||
258 | 256: , shift=dx, | diff | crashed | cluster 35 (stopped) | 4 | |
259 | 248: , shift=2 * dx | diff | crashed | cluster 36 (stopped) | 4 | |
260 | 250: , , shift=2 * dx | diff | crashed | renfield, sancho, kato (stopped) | 12 | |
261 | 259: , shift=10 * dx | diff | crashed | luggage (stopped) | 12 | |
262 | 257: , , shift=10*dx, | diff | crashed | bunter (stopped) | 4 | |
263 | 260: , , shift=10 * dx | diff | crashed | renfield, sancho, kato (stopped) | 12 |
Fixing LAMMPI bug
Had a weird problem with the following error when using LAMMPI.
-----------------------------------------------------------------------------
The selected RPI failed to initialize during MPI_INIT. This is a fatal error; I must abort.
This occurred on host poole (n0).
The PID of failed process was 29080 (MPI_COMM_WORLD rank: 0)
-----------------------------------------------------------------------------
-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit code. This typically indicates that the process finished in error. If your process did not finish in error, be sure to include a "return 0" or "exit(0)" in your C code before exiting the application.
PID 29081 failed on node n0 (127.0.0.1) with exit status 1.
-----------------------------------------------------------------------------
The solution involved using the command "ipcs" and "ipcrm" and is explained here:
The above post has a perl script to run it use
exec `./ipcrmAll.pl -s`
Parallel on the cluster
The table below shows sweep times for the reactive wetting simulation on the cluster (Simulation 122). The results are awful as was probably to be expected due to the bandwidth issues with the cluster.
Nproc | sweep 1 | sweep 2 | sweep 3 |
1 | 17.43 | 13.65 | 13.72 |
2 | 9.65 | 7.34 | 7.40 |
4 | 19.23 | 19.76 | 7.26 |
8 | 23.11 | 20.72 | 21.44 |
16 | 29.33 | 100.40 | 126.25 |
Simulations on Raritan
- We copied simulation 119 to 120 to remove the HDF5 calls. At this stage we don't have HDF5 installed due to build issues with szip
Commands for raritan
This submits a job with 4 nodes and 1 processor per node
qsub -q infiniband -lnodes=4:ppn=1 -W x=NACCESSPOLICY:SINGLEJOB ./myscript
Simulations
- 8-128 processors
- 8-32 processor per node basis
- dt from 1e-15 to something
- Problem size 256x256, 512x512, 1024x1024
Simulation set 1, , 256x256
- : number of nodes (each node has X processors)
- : number of processors being used on each node
- : wall clock time at sweep
4 | 1 |
Equilibrium calculations
Currently running 1D equilibrium calculations to determine thermodynamics that give an equilibrium contact angle of . This is done by varying . Simulations 116, 117 and 118 are being used for this calculation. B_0 = 25418.6582832
18.98 | 31.74 | 44.16 | ||
18.53 | 31.95 | 43.74 | ||
18.50 | 32.15 | 45.45 | ||
Spatial convergence with a smaller drop
The radius of the drop in the following simulations is 4.3 m and m. The previous simulations had a drop with radius 1.1 m and m.
ID | Machine | scheme | |
104 | alice (4) stopped | 150 | central difference |
105 | renfield (4) 10589 | 300 | central difference |
106 | luggage (32) 40986 | 600 | central difference |
107 | benson (4) 15976 | 150 | power law |
108 | poole (4) 15934 | 300 | power law |
109 | 600 | power law | |
110 | sancho (4) stopped | 150 | van Leer |
111 | alice (4) 11750 | 300 | van Leer |
112 | benson (4) 19216 | 150 | QUICK |
Spatial Convergence
The figure below shows triple junction position against time for the simulations 93, 99, 100, 101, 102, 103. What is clear is that they don't all agree. In and of itself this is not a huge problem, as long as we can show that there is convergence for some grid resolution. If under-resolved calculations spread in a similar manner and have the same trends, we can get away with running parameter search-type simulations at under-resolved conditions. Simulation 102 is running with a 900 x 900 grid. I think the drop size can be reduced for these convergence calculations. It is fairly clear that the weird spurious velocities at the triple junction go away when the van Leer or power law schemes are used.
Central Difference N=225
Power Law N=225
Power Law N=450
van Leer N=225
Meeting With Frank
Why has the reactive wetting problem taken so damn long?
It's really difficult. No one has modeled multicomponent van der Waals liquid-vapor (and we have a solid too).
- source:trunk/reactiveWetting/paper/sandia1.png
- source:trunk/reactiveWetting/paper/sandia2.png
- flow with difficult source terms in the Navier-Stokes, standard codes don't deal with this
- parasitic currents
- pressure-correction algorithm needed to be invented
- compressible due to nature of pressure density coupling (most work has perfect gas relationship for pressure based approach)
- low Mach number compressible density coupled (rather than pressure) flow
- multicomponent decoupling
- didn't have parallel capability
- thermo
- difficulty understanding the wetting issues (the surface tensions are not input parameters)
- optimal choice of primitive variables (, , , )
- choice of diffusion coefficient
- code complexity and difficult to organize, it has to solve a variety of different problems
- did not have state of the art preconditioners and solvers
- viscosity and diffusion ratios (interpolation schemes) (requires good preconditioners and solvers)
- density trapping
Some of these issues are avoided with the three phase field approach, where density is defined by the phase field variable
but the van der Waals model () has fundamental advantages
- a more fundamental physical description
- handles the critical temperature
- compression waves, which are interesting
- no arbitrary phase field equations
What is the current situation?
- movies
- improved the Schmidt number to physical (viscosity) by many orders of magnitude blog:UnsegregatedSolver#comment-1
- almost complete physical model of the system
- unphysical vapor, not a major issue (concentration, density and viscosity)
- solid viscosity
- triple point location results source:trunk/reactiveWetting/paper/position.png source:trunk/reactiveWetting/paper/interface0010840tp3.png source:trunk/reactiveWetting/paper/interface0010840tp2.png
- source:trunk/reactiveWetting/paper/interface0010840.png
What did we do to fix it?
- full matrix method
- Trilinos preconditioners and solvers
- Exponential interpolation scheme
- Full implementation of Trilinos as the "backend" for fipy, state of the art preconditioners, solvers and parallel capability
- Using the full matrix method
- allows us to leverage full capability of Trilinos preconditioners
- Parallel version
- reason to use FiPy is to have easy ways to add new terms and components
- Talking to Aaron Lott (preconditioner expert in maths and computer div.)
Will I write a paper soon?
Yes. Here is the current draft source:trunk/reactiveWetting/paper/paper.pdf
What did we do wrong?
- in essence, this is a very deep and difficult mathematical model
- ignore pressure for "real" physical model and write intermediate papers
- two computer guys should be working together on each problem, never alone as we currently do
- better code organization
Will all this work carry over into other problems?
Yes. Certainly the parallel, full matrix and Trilinos capabilities are immediately applicable to
- Photo-Voltaics
- Superconformal Ni Deposition
Other problems that we could possibly tackle include
- Zeta-potential
- VLS nano-wire growth
- Vehicle Light-weighting
- Ink Jet Process
- mem-resistive technologies
- Crystal Plasticity with Lyle
- Martensites
What other techniques do we still need to implement?
- get all stuff back into fipy that I have developed for the Reactive Wetting code.
- second order accurate unstructured capability
- adaptivity
- Newton iterations
- block preconditioners (working with Aaron Lott)
Testing Advection Schemes
As a start to getting a higher order advection scheme in FiPy I've been looking at what we already have. The following test is for the advection at 45 degrees to a Cartesian grid. The analytical solution is this on a 110 x 60 grid:
The Van Leer scheme solution on a 110 x 60 grid:
The power law scheme on a 110 x 60 grid:
The actual convergence rates based on the L1 norm (on 5 grids with 11 * N x 6 * N cells with N=(1,2,4,8,16)):
I implemented the QUICK scheme without limiting, which is essentially the Van Leer scheme without the limiting.
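As an aside, the observed order from a sequence of L1 errors can be estimated in a couple of lines of numpy; the error values below are placeholders rather than the measured ones.

import numpy as np

# hypothetical L1 errors on the five grids (refinement factors N = 1, 2, 4, 8, 16)
errors = np.array([8.0e-2, 5.0e-2, 3.0e-2, 1.8e-2, 1.1e-2])

# observed order between successive refinements: p = log2(error_N / error_2N)
orders = np.log(errors[:-1] / errors[1:]) / np.log(2.)
print(orders)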
Timings on Luggage
- Luggage had one additional job running.
900 x 900
sweep | 1 | 2 | 4 | 8 | 16 | 32 | 60 | 62 |
1 | 677.69 | 382.98 | 230.94 | 107.78 | 56.23 | 30.74 | 16.25 | 16.53 |
2 | x | 303.69 | 192.95 | 90.93 | 51.63 | 28.6 | 14.85 | 14.43 |
1800 x 1800
sweep | 16 | 32 | 62 | 63 |
1 | 231.93 | 141.68 | 93.18 | 87.18 |
2 | 187.85 | 111.25 | 69.97 | 63.80 |
Chat with Jim --- Reactive Wetting Direction
Three main directions
- compare with Cox theory
- compare with Warren, Boettinger, Roosen for spreading
- compare with Pismen & Pomeau
Full Parallel Implementation of the Reactive Wetting
- The numbers are wall clock times in seconds; CPU clock time is hard to get with MPI jobs
- The number in parentheses is the matrix solution time.
- Luggage was heavily loaded when these calculations were done
- Poole has two additional processes running.
900 x 900
sweep | 1 (poole) | 2 (poole) | 4 (poole) | 8 (luggage) | 16 (luggage) | 32 (luggage) | 64 (luggage) |
1 | x | x | x | 189.89 (1.53) | 90.41 (1.71) | 46.48 (0.43) | 34.40 (0.94) |
2 | x | x | x | 210.94 (5.00) | 75.16 (5.29) | 43.60 (2.55) | 34.48 (4.67) |
450 x 450
sweep | 1 (poole) | 2 (poole) | 4 (poole) | 8 (poole) | 8 (luggage) | 16 (luggage) | 32 (luggage) | 64 (luggage) |
1 | 73.86 (0.64) | 39.23 (0.332) | 20.62 (0.189) | 11.18 (0.102) | 26.59 (0.220) | 32.05 (0.264) | 29.24 (1.08) | 13.37 (0.24) |
2 | 61.97 (3.13) | 30.82 (1.198) | 16.07 (0.742) | 8.62 (0.461) | 23.74 (0.535) | 19.855 (0.640) | 10.22 (0.596) | 7.51 (0.795) |
225 x 225
sweep | 1 (poole) | 2 (poole) | 4 (poole) | 8 (poole) | 8 (luggage) | 16 (luggage) | 32 (luggage) | 64 (luggage) |
1 | 19.16 (0.16) | 10.16 (0.086) | 5.57 (0.04) | 3.44 (0.026) | 6.01 (0.076) | 13.82 (0.12) | 9.66 (0.14) | 16.87 (0.12) |
2 | 14.51 (0.55) | 7.56 (0.31) | 3.92 (0.18) | 2.03 (0.13) | 5.01 (0.19) | 8.36 (0.14) | 5.91 (0.22) | x |
900 x 900
sweep | 1 (poole) | 2 (poole) | 4 (poole) | 8 (luggage) | 16 (luggage) | 32 (luggage) | 64 (luggage) |
1 | x | x | x | 136.75 (1.80) | 130.58 (1.69) | 47.51 (1.87) | 36.69 (0.976) |
2 | x | x | x | 253.05 (133.37) | 194.08 (74.12) | 142.07 (84.92) | 72.17 (47.37) |
450 x 450
sweep | 1 (poole) | 2 (poole) | 4 (poole) | 8 (poole) | 8 (luggage) | 16 (luggage) | 32 (luggage) | 64 (luggage) |
1 | 81.49 (1.07) | 40.69 (0.51) | 21.46 (0.30) | 11.19 (0.16) | 55.53 (1.82) | 14.31 (0.39) | 16.32 (0.24) | 19.46 (0.52) |
2 | 158.42 (97.25) | 80.21 (49.89) | 41.32 (25.99) | 23.89 (15.94) | 83.85 (20.27) | 32.60 (20.04) | 18.87 (11.13) | 35.52 (28.13) |
225 x 225
sweep | 1 (poole) | 2 (poole) | 4 (poole) | 8 (poole |
1 | 20.03 (0.27) | 10.21 (0.12) | 5.57 (0.075) | 3.95 (0.041) |
2 | 38.44 (23.78 ) | 19.25 (11.97) | 10.08 (6.39) | 6.10 (4.16) |
Parallel Efficiency Data
The following table shows run times and memory usage for the reactive wetting problem on a number of nodes. The time step is . The machine is poole, which has 8 nodes and 32GB.
category | no mpi | mpi 1 | mpi 2 | mpi 4 | mpi 8 |
sweep 1 (s) | 85 | 85 | 74 | 53 | 39 |
sweep 2 (s) | 186 | 180 | 156 | 105 | 81 |
sweep 3 (s) | 289 | 284 | 243 | 161 | 136 |
memory high (GB) | 0.5 | 1.1 | 3.4 | 6.4 | 10.2 |
memory stable (GB) | 0.5 | 1.1 | 3.4 | 3.9 | 2.8 |
The memory usage results are very strange. Initially the memory spikes at a high value somewhere in the initialization phase (memory high) and then for the rest of the simulation (all subsequent sweeps) the memory stays at or around the stable value (memory stable). The stable value seems to be independent of the number of processors, but is higher when there is more than one processor. The memory usage is taken from top in the section with Mem:XXXX total, YYYY used. I'm using YYYY minus the initial YYYY to get the number.
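A sketch of how this bookkeeping could be automated instead of reading the numbers off top by hand (the helper name is made up; it reports the same total-minus-free "used" figure that top shows):

def used_memory_kB():
    # "used" in the sense reported by top: MemTotal - MemFree, in kB
    info = {}
    for line in open('/proc/meminfo'):
        key, value = line.split(':')
        info[key.strip()] = int(value.split()[0])
    return info['MemTotal'] - info['MemFree']

baseline = used_memory_kB()

# ... call this once per sweep inside the run loop ...
print (used_memory_kB() - baseline) / 1024.**2, "GB above the initial usage"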
Numpy on the Altix
Thus far, I've done the following
I've added
[mkl]
library_dirs = /opt/intel/Compiler/11.0/074/mkl/lib/64
lapack_libs = mkl_lapack
mkl_libs = mkl_intel_lp64, mkl_intel_thread, mkl_lapack, mkl_core
to the site.cfg file in numpy and then built numpy with the following command
python setup.py config --compiler=intel build_clib --fcompiler=intel build_ext --compiler=intel --verbose build install --prefix=${USR}
Currently I'm getting an import error:
>>> import numpy
Traceback (most recent call last):
File "<stdin>", line 1, in ? File "/users/wd15/usr/ia64/2.6.16.60-0.23.PTF.403865.0-default//lib/python2.4/site-packages/numpy/init.py", line 143, in ?
import linalg
File "/users/wd15/usr/ia64/2.6.16.60-0.23.PTF.403865.0-default//lib/python2.4/site-packages/numpy/linalg/init.py", line 47, in ?
from linalg import *
File "/users/wd15/usr/ia64/2.6.16.60-0.23.PTF.403865.0-default//lib/python2.4/site-packages/numpy/linalg/linalg.py", line 29, in ?
from numpy.linalg import lapack_lite
ImportError: /opt/intel/Compiler/11.0/081/mkl/lib/64/libmkl_intel_thread.so: undefined symbol: __kmpc_end_critical
Compiling Trilinos on the Altix with the Intel compilers
This looks promising (from a trilinos mailing list entry):
../configure CC=icc CXX=icc F77=ifort CFLAGS="-O3 -fPIC" CXXFLAGS="-O3 -fPIC -LANG:std -LANG:ansi -DMPI_NO_CPPBIND" FFLAGS="-O3 -fPIC" --prefix=${USR} --with-install="/usr/bin/install -p" --with-blas="-L/opt/intel/Compiler/11.0/081/mkl/lib/64 -lmkl -lpthread" --with-lapack="-L/opt/intel/Compiler/11.0/081/mkl/lib/64 -lmkl_lapack -lpthreads" --enable-mpi --enable-amesos --enable-ifpack --enable-shared --enable-aztecoo --enable-epetra --enable-epetraext --enable-external --enable-ml --enable-threadpool --enable-thyra --enable-stratimikos --enable-triutils --enable-galeri --enable-zoltan --cache-file=config.cache
This is the "normal" thing we do on 64 bit:
../configure CXXFLAGS="-O3 -fPIC" CFLAGS="-O3 -fPIC" FFLAGS="-O5 -funroll-all-loops -fPIC" F77="g77" --enable-epetra --enable-aztecoo --enable-pytrilinos --enable-ml --enable-ifpack --enable-amesos --with-gnumake --enable-galeri --cache-file=config.cache --with-swig=$PATH_TO_SWIG_EXECUTABLE --prefix=$LOCAL_INSTALLATION_DIR
Preconditioning
Chat with Aaron. We need to do the following:
- figure out max time step for trilinos jacobi, block jacobi, ML and block ML (where I change the A that is passed, only the diagonal blocks)
- look at overall solution time as well
- number of iterations
Working 2D simulations with whole matrix method (unsegregated solver)
Finally have working 2D simulations with the whole matrix method. Simulations source:trunk/reactiveWetting/65 and source:trunk/reactiveWetting/66 both have , but and , respectively. Movies are in the source directories for velocity, density, concentration and phase. The sharp interface simple analytic solution is plotted in the images. It is questionable whether the results have reached a final equilibrium; there is a shift from the analytic curve, especially for simulation 65. The diffusion is an order of magnitude slower in this case. It is probable that this simulation has to run for longer in order to see convergence to the equilibrium result. The final equilibrium shapes for simulations 65 and 66 are in source:trunk/reactiveWetting/65/conc0005400.png and source:trunk/reactiveWetting/66/conc0005400.png, respectively.
Convergence
Figures source:trunk/reactiveWetting/65/velocity.png and source:trunk/reactiveWetting/66/velocity.png show the maximum velocity against time. Neither simulation has completely converged; it may be that the drops still need to adjust to equilibrium, or the velocities could be parasitic in nature.
Next steps
- Run a simulation with . This may converge a little better. (67)
- Run a simulation with but for a grid of 200 * 200 (68)
- Profile for speed and adjust.
- axi-symmetric drops.
1D solid liquid vapor simulation
source:trunk/reactiveWetting/63/movie.gif
This is a 1D solid-liquid-vapor simulation with the unsegregated solver. The parameter values are and everywhere.
Unsegregated Reactive Wetting
Equilibrium calculations
ID | Changes | Notes |
47 | with 48 | Used the trilinos solver and newer simulations use superlu |
52 | with 47 | Used the trilinos solver and newer simulations use superlu |
53 | with 47 | Used the trilinos solver and newer simulations use superlu |
60 | with 47 | using superlu |
Convergence calculations
ID | Changes | Notes |
48 | with 47 | |
49 | with 48 | |
50 | with 48 | |
51 | with 48 |
Experiment with new solver
Trying to optimize the solver so it doesn't crash. Using superlu really helps.
ID | Changes | Notes |
54 | with 51 |
Convergence check with new method
ID | Changes | Notes |
55 | with 54 | |
56 | with 55 | |
57 | with 55 | |
58 | with 55 | |
59 | with 55 |
Cleaning up input files
ID | Changes | Notes |
61 | with 55 |
Unsegregated solver class and term multiplication
Simulation 61 represents a binary liquid-vapor using the Unsegregated solver class and the term multiplication branch. It seems to work.
Reactive Wetting Equations in Cylindrical Coordinates
Note
We need to remove the radial distance and embed it in each term's coefficient. A general axisymmetric equation in cylindrical coordinates can be written,
Multiply through by to get,
which is,
Thus, in general, it is very easy to recast equations in cylindrical axi-symmetric coordinates with fipy. The fourth order term in the concentration equations may be an issue and the correction equation also has a fourth order term.
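Before getting to the fourth order pieces, here is a minimal sketch of the "multiply through by r" recipe for a bare diffusion equation (the reactive wetting equations have many more terms; the mesh, the coefficient D and the coordinate indexing below are assumptions and may need adjusting to the FiPy version in use):

from fipy import Grid2D, CellVariable, TransientTerm, DiffusionTerm

mesh = Grid2D(nx=100, ny=100, dx=0.01, dy=0.01)   # x plays the role of r, y of z
phi = CellVariable(mesh=mesh, name="phi")

rCell = mesh.getCellCenters()[0]   # radial distance at the cell centres
rFace = mesh.getFaceCenters()[0]   # radial distance at the face centres
D = 1.0

# the axisymmetric equation d(phi)/dt = (1/r) div(r D grad(phi)), multiplied
# through by r, becomes r d(phi)/dt = div(r D grad(phi)), so r is simply
# folded into the transient and diffusion coefficients
eq = TransientTerm(coeff=rCell) == DiffusionTerm(coeff=rFace * D)
eq.solve(var=phi, dt=1e-3)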
Fourth Order Term
This can be rewritten as
There is an awkward that is not next to a and thus can not just multiply a regular coefficient. The outer gets dealt with because we multiply all the terms in the equation by . The above can be rewritten as,
Unfortunately, the last term needs to be added as a source term. If we had a third order convection term then it could be added implicitly.
Continuity
Velocity
Concentration Equation
The last term can remain in its original form as it is a convection term.
Alternate Method
The formulation above could lead to some coding issues and maybe other difficulties concerning the fourth order source terms. A more expedient and relatively trivial method that is probably no less accurate is to define a CylindricalGrid2D object that has its mesh volumes and face areas modified accordingly. I just realized that the discretization is exactly equivalent in the finite volume method when using a cylindrically modified grid and rewriting the equations.
Reactive Wetting Paper
Outline
- Introduction
- Boiler plate - motivation
- Summary of physical parameters coinciding with typical applications (water-ice-salt, solder alloys)
- Thermodynamics
- Model
- Transport
- Thermodynamics
- Surface Tensions
- Parameter Choices - Numerics
- Governing equations summary.
- Numerical Approach
- Solution algorithm
- Parasitic Currents
- One dimensional liquid vapor analysis
- Dimensionless form of the pure-liquid vapor
- Convergence of a one-dimensional pure liquid-vapor system
- ,
- ,
- liquid
- Convergence of a one-dimensional binary liquid vapor system
- , , What is the meaning?
- issues.
- Two dimensional Binary Solid-Liquid-Vapor
- Obtaining Plausible contact angles
- solid
- Approximation of a solid with , density trapping.
- Comparison with analytical solution in equilibrium
- Results
- Extracting interface locations and junctions with phase field methods
- Ridge detection
- Wheeler method
- Villanueva method
- versus .
- Cox comparisons - look at later Cox work with surfactants.
- for different initial conditions.
- Pure Hydrodynamics with and without reactive wetting ( vs. )
- Pretty pictures
- Compare with Gustav
- Steady state versus transient
- Compare streamlines
- versus diffusion
- versus
- Compare with S. Troian
- Spherical cap deviation
- Best measures of angle and shape
- Compare with Jim and Bill's right angle assumption
Copies of notes
Contact Angle Metrics
Given the simulations from here, we can predict the triple point motion and contact angles.
The following figure shows the triple point position against time normalized by the initial radius of the drop.
The triple point position is given by
where
and
and
The roughness of the curves in the figure is due to the zero order interpolation of . Currently, the center of the cell with the minimum value is deemed to be the triple point. Higher order interpolation should improve the jagged nature of the curves. Need at least second order interpolation.
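A sketch of what second order interpolation could look like along a single row of cells: fit a parabola through the minimum cell and its two neighbours and take its vertex as the triple point position (the cell spacing and test function here are made up):

import numpy

def subcell_minimum(x, f):
    """Locate the minimum of f to sub-cell accuracy on equally spaced centres x."""
    i = numpy.argmin(f)
    if i == 0 or i == len(f) - 1:
        return x[i]                              # no neighbours to fit through
    fl, f0, fr = f[i - 1], f[i], f[i + 1]
    denom = fl - 2. * f0 + fr
    if denom == 0.:
        return x[i]
    dx = x[1] - x[0]
    return x[i] + 0.5 * dx * (fl - fr) / denom   # vertex of the fitted parabola

x = numpy.linspace(0., 1., 101)
f = (x - 0.432)**2
print subcell_minimum(x, f)   # 0.432, rather than the nearest cell centre at 0.43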
This movie shows the four points that are chosen to determine the triple point angles. The position of the four points seems reasonable.
The solid vapor interface point, such that,
where
with
The radius of the circle is chosen to be . We'll need to see how contact angle varies with this parameter.
Contact Angles
The links in the table below show contact angles versus time using the method described above.
ID | Figure | ||
2 | figure | ||
4 | figure | ||
6 | figure | ||
7 | figure | ||
8 | figure | ||
9 | figure | ||
10 | figure | ||
12 | figure |
Thoughts:
- The initial contact angles in these figures seem to be somewhat wrong. The solid contact angle should be while the other angles should be .
- Initial contact angle should improve with bigger droplets and this isn't happening.
- The liquid and vapor angles both seem to drift away from the equilibrium values.
- The curves are extremely jagged.
- The surface tensions used to calculate the equilibrium contact angle could be for a different thermo. Need to double check.
- Even though the system is not in equilibrium the contact angles should probably be in equilibrium.
What next?
- Figure out why the initial angles are wrong.
- Plot the equilibrium solution against the last frame in the simulation to figure out how close it is to equilibrium.
- Use higher order interpolation for the contact points to improve the smoothness.
- Making smaller may improve the results.
- Test the equilibrium contact angle in the case where the solid has the same viscosity as the liquid, so that the system reaches equilibrium. Probably need to run that case again.
- run a full sphere on a solid surface that doesn't melt (correct equilibrium concentrations). Use this to compare with Cox's theory.
- vary for one frame of a simulation to see how the angle changes.
Jobs
Two new jobs
ID | PID | machine | status | (m) | movie | Angles |
25 | 13638.0 | cluster | stopped | 0.4 | movie | |
26 | 13639.0 | cluster | R | 0.8 | movie | figure |
27 | 13740.0 | cluster | R | 0.35 | movie | figure |
28 | 12539 | poole | R | 1.8 | movie | figure |
29 | 14230.0 | cluster | R | 0.35 | movie | figure |
Various plots for the simulations above,
Issues:
- The number seems to be larger than Cox's theory predicts.
- The drop off is slower than Cox's theory predicts.
- Cox's theory is for high number. Need to run some simulations in this regime.
- Need to improve data sampling. This is relatively easy to do by refining the grid and interpolating the base fields.
- axi-symmetric may also be required
Peclet Number
A rough estimate for the Peclet Number is given by
If we take an interface length for of , then at early times the Peclet number is 100 in the solid and 0.01 in the liquid. This indicates that if we give the solid value of to the fluid then the Peclet number will be large everywhere and will hopefully allow a better fit to Cox's theory. Simulation 29 in the table above is running with
Jobs with lower Peclet number
ID | PID | machine | status | (m) | movie | Angles | |||
29 | 14230.0 | cluster | unstable | 0.35 | movie | figure | |||
30 | 14256.0 | cluster | finished | 0.35 | movie | figure | |||
31 | 14257.0 | cluster | finished | 0.35 | movie | figure | |||
32 | 14258.0 | cluster | unstable | 0.35 | movie | figure | |||
33 | 14280.0 | cluster | R | 0.35 | 0 | movie | figure | ||
34 | 14296.0 | cluster | R | 0.35 | 0 | movie | figure |
Higher order interpolation
It works!
High Peclet number
Can't seem to run the code with a Peclet number of about 10 and above. This is certainly connected with other issues. The figure below shows the Cox theory fit for decreasing corresponding to increasing Peclet number. As the Peclet number is increasing, the curvature of the simulation points is increasing. This seems to show it is getting closer to the theory, though it is hard to tell. It is probable that as the Peclet number passes through 1 there will be a shift from one spreading regime to another.
Mbar=0
Far out. By solving for and separately and with corrections the system can be solved when . It's interesting that the system remains stable. The movie for this simulation is movie. I can't use this for Cox's theory as the system does not seem to move towards the equilibrium contact angles. What are the equilibrium contact angles in the case when ? It's also interesting that the pressure remains higher in the fluid state.
Low viscosity
The next step is to try the system with a much lower viscosity in the fluid. Running a job with . Working on this issue made me think about modeling the system by solving for and instead. Simulation 35 is my attempt at this issue. I added the RHS flux back in to each of these equations and did the variational derivative. Unfortunately this wasn't really successful at low viscosity. Same sort of issues as before.
Simulations with varying solver tolerance
ID | status | condor ID | Tolerance | Step Time (s) | Steps | Distance (R) | Sim Time (s) | Bill's movie | ||||
18 | stopped | 12087.0 | 120 | 460 | 0.13 | 0.00256 | movie | |||||
19 | stopped | 12088.0 | 96 | 460 | 0.13 | 0.00256 | movie | |||||
20 | stopped | 12089.0 | 66 | 460 | 0.13 | 0.00256 | movie | |||||
21 | stopped | 12090.0 | 156 | 460 | 0.13 | 0.00256 | movie | |||||
22 | running | 12093.0 | -- | -- | -- | -- | -- | -- | -- | movie |
Notes
- These simulations need to run for longer to ascertain the long term effects of larger solver tolerances.
- According to the Trilinos manuals, the default stopping criterion is .
- I'm still confused about how the tolerance relates to the error. Certainly I know that the error is linear with residual (Ae=r). This would say that an order of magnitude decrease in the tolerance leads to an order of magnitude decrease in error. However if the tolerance is 1 then no iterations should be done, which would be pathological. I am running a tolerance = 1 case, so we'll see.
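A toy illustration of the Ae = r relationship (random, well conditioned matrix; the size and seed are arbitrary): the error of an approximate solution is exactly the solve of the residual, so shrinking the residual by a decade shrinks the error by about a decade as long as the conditioning is reasonable.

import numpy

numpy.random.seed(0)
N = 50
A = numpy.random.rand(N, N) + N * numpy.eye(N)      # diagonally dominant, well conditioned
xExact = numpy.random.rand(N)
b = numpy.dot(A, xExact)

xApprox = xExact + 1e-3 * numpy.random.rand(N)      # stand-in for an iterative solution
r = b - numpy.dot(A, xApprox)                       # the residual the solver reports
e = numpy.linalg.solve(A, r)                        # the error satisfies A e = r

print numpy.allclose(xExact - xApprox, e)           # True: error and residual are linearly related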
Altix and FiPy. What can we expect?
What are the expected performance benefits for FiPy using the Altix machine?
The following questions are pertinent.
- Which examples?
- What proportion of time is spent in routines that could benefit from a parallel capability?
- How long do the jobs currently take on one processor?
- How much memory do the jobs currently use?
job ID | PID | Status | Radius Drop Size | N | Proportion | Solver | Max Memory (Gb) | Steps | Time Per Step (s) | Time (days) | Simulation time (s) | Altix (k=1.6) (days) | Poole (k=1.6) |
14 | stopped | 34000 | 0.774 (@) | 0.542 | 0.41 | 450 | 185 | 0.96 | 0.0016 | 0.21 | 0.28 | ||
15 | stopped | 136000 | 0.806 | 0.579 | 0.99 | 680 | 863 | 6.8 | 0.00226 | 1.5 | 2.0 | ||
16 | poole 4430 | running | 544000 | 0.822 (+) | 0.562 (+) | 3.3 | 929 (%) | 3336.0 | 35.9 (%) | 0.00275 (%) | 8.0 | 11 | |
17 | poole 4714 | running | 2176000 | 0.798 (&) | 0.496 (&) | 13 | 50667 (#) | 11034.0 | 190 (647.1(#)) | 0.0293 (#) | 42 | 56 | |
4896000 | 28 | 502 | 112 | 148 | |||||||||
483 | 15420 | 3440 | 4552 |
- (+) still need many more data points
- (%) reasonable extrapolation, waiting for more data points
- (#) very bad extrapolation, waiting for more data points
- (&) only at 200 steps, this number rises until 600 steps
- (@) steps are too quick to get a good number
Notes
The quoted simulation wall clock times are for the triple point to cover a distance of R / 10.
The memory usage seems to go like with and .
The simulation times are estimated with a power law, with and .
The following figure shows the time spent in routines that could benefit from a parallel capability .
A quick study of solver iterations and tolerance may also yield gains.
Other functions that are costly, but are not readily reimplemented in parallel.
Radius Drop Size | N | FillComplete | InsertGlobalValues | array |
34000 | 0.068 | 0.039 | 0.015 | |
136000 | 0.052 | 0.024 | 0.017 | |
544000 | 0.052 | 0.026 | 0.017 | |
2176000 | 0.057 | 0.027 | 0.020 |
Poor weave results on the altix
The times below are very bad. I have no idea why this is the case. Poole had great speed ups.
N | Threads | speedup | speedup factor | raw times (s) |
500 | 0 | 1.00 | 1.00 | 0.50 |
500 | 1 | 0.33 | 0.00 | 1.50 |
500 | 2 | 0.65 | 0.65 | 0.76 |
500 | 4 | 1.29 | 1.13 | 0.38 |
500 | 8 | 2.49 | 1.36 | 0.20 |
500 | 16 | 4.43 | 1.45 | 0.11 |
500 | 32 | 6.08 | 1.43 | 0.08 |
500 | 64 | 6.35 | 1.36 | 0.08 |
1000 | 0 | 1.00 | 1.00 | 12.37 |
1000 | 1 | 0.62 | 0.00 | 20.06 |
1000 | 2 | 1.05 | 1.05 | 11.76 |
1000 | 4 | 1.88 | 1.37 | 6.57 |
1000 | 8 | 2.75 | 1.40 | 4.50 |
1000 | 16 | 2.91 | 1.31 | 4.25 |
1000 | 32 | 3.11 | 1.25 | 3.98 |
1000 | 64 | 3.38 | 1.23 | 3.66 |
2000 | 0 | 1.00 | 1.00 | 179.58 |
2000 | 1 | 0.74 | 0.00 | 244.13 |
2000 | 2 | 1.43 | 1.43 | 125.62 |
2000 | 4 | 2.91 | 1.71 | 61.63 |
2000 | 8 | 4.20 | 1.61 | 42.77 |
2000 | 16 | 4.97 | 1.49 | 36.16 |
2000 | 32 | 4.89 | 1.37 | 36.74 |
2000 | 64 | 5.87 | 1.34 | 30.60 |
3000 | 0 | 1.00 | 1.00 | 510.73 |
3000 | 1 | 0.71 | 0.00 | 719.87 |
3000 | 2 | 1.26 | 1.26 | 404.31 |
3000 | 4 | 2.30 | 1.52 | 221.79 |
3000 | 8 | 3.82 | 1.56 | 133.80 |
3000 | 16 | 3.98 | 1.41 | 128.48 |
3000 | 32 | 4.07 | 1.32 | 125.33 |
3000 | 64 | 4.80 | 1.30 | 106.35 |
Posting code
Posting some code because it can't be shared using svn.
import numpy
from scipy import weave

def multiply(N, nthreads):

    i, j = numpy.indices((N, N), 'd')
    c = numpy.zeros((N, N), 'd')

    a = i + j
    b = i * j

    code_pragma = """
    int i, j, k;

    #pragma omp parallel shared(a,b,c,nthreads) private(i,j,k) num_threads(nthreads)
    {
    #pragma omp for schedule (dynamic)
    for (i=0; i<N; i++)
        for(j=0; j<N; j++)
            for (k=0; k<N; k++)
                c[i * N + j] += a[i * N + k] * b[k * N + j];
    }
    """

    code_no_pragma = """
    int i, j, k;

    for (i=0; i<N; i++)
        for(j=0; j<N; j++)
            for (k=0; k<N; k++)
                c[i * N + j] += a[i * N + k] * b[k * N + j];
    """

    args = {'N' : N, 'a' : a, 'b' : b, 'c' : c, 'nthreads' : nthreads}

    if nthreads == 0:
        code = code_no_pragma
    else:
        code = code_pragma

    import time

    t0 = time.time()

    weave.inline(code,
                 args.keys(),
                 local_dict=args,
                 type_converters=None,
                 compiler='gcc',
                 force=0,
                 verbose=0,
                 extra_compile_args=['-O3 -fopenmp'],
                 extra_link_args=['-O3 -fopenmp'],
                 headers=['<omp.h>'])

    return time.time() - t0

from fipy.tools import dump

results = {}

dump.write(results, 'datafile.gz')

for N in (500, 1000, 2000, 3000, 4000, 5000, 6000):

    results = dump.read('datafile.gz')
    results[N] = {}
    dump.write(results, 'datafile.gz')

    for nthreads in range(9):

        results = dump.read('datafile.gz')
        results[N][nthreads] = []
        dump.write(results, 'datafile.gz')

        for sim in range(5):

            results = dump.read('datafile.gz')
            results[N][nthreads].append(multiply(N, nthreads))
            dump.write(results, 'datafile.gz')
openmp and weave on poole
Some results for openmp on poole. The test is for an N by N matrix multiplication. The code is here. The speed up factor is certainly improving as the threads increase. The results for N=4000 are nonsense. Probably Ed running stuff. I should really have recorded the resources available during each simulation by keeping a record with time.clock() and time.time() (a sketch of such a record follows the data below).
N | speedup factor for 8 threads |
500 | 1.45 |
1000 | 1.78 |
2000 | 1.80 |
3000 | 1.82 |
4000 | 1.30 |
5000 | 1.76 |
More of the data (columns as above: N | Threads | speedup | speedup factor | raw times (s)).
500 | 0 | 1.00 | 1.00 | 0.67 |
500 | 1 | 0.52 | 0.00 | 1.30 |
500 | 2 | 0.97 | 0.97 | 0.69 |
500 | 3 | 1.41 | 1.24 | 0.48 |
500 | 4 | 1.82 | 1.35 | 0.37 |
500 | 5 | 2.22 | 1.41 | 0.30 |
500 | 6 | 2.65 | 1.46 | 0.25 |
500 | 7 | 2.86 | 1.45 | 0.23 |
500 | 8 | 3.03 | 1.45 | 0.22 |
1000 | 0 | 1.00 | 1.00 | 11.52 |
1000 | 1 | 0.85 | 0.00 | 13.55 |
1000 | 2 | 1.65 | 1.65 | 6.98 |
1000 | 3 | 2.35 | 1.72 | 4.89 |
1000 | 4 | 3.05 | 1.75 | 3.78 |
1000 | 5 | 3.84 | 1.79 | 3.00 |
1000 | 6 | 4.43 | 1.78 | 2.60 |
1000 | 7 | 5.12 | 1.79 | 2.25 |
1000 | 8 | 5.69 | 1.78 | 2.03 |
2000 | 0 | 1.00 | 1.00 | 113.21 |
2000 | 1 | 0.89 | 0.00 | 127.71 |
2000 | 2 | 1.69 | 1.69 | 67.07 |
2000 | 3 | 2.46 | 1.77 | 45.99 |
2000 | 4 | 3.21 | 1.79 | 35.27 |
2000 | 5 | 3.96 | 1.81 | 28.56 |
2000 | 6 | 4.64 | 1.81 | 24.38 |
2000 | 7 | 5.22 | 1.80 | 21.68 |
2000 | 8 | 5.81 | 1.80 | 19.48 |
3000 | 0 | 1.00 | 1.00 | 385.15 |
3000 | 1 | 0.89 | 0.00 | 431.11 |
3000 | 2 | 1.70 | 1.70 | 226.09 |
3000 | 3 | 2.49 | 1.78 | 154.88 |
3000 | 4 | 3.24 | 1.80 | 118.80 |
3000 | 5 | 4.02 | 1.82 | 95.81 |
3000 | 6 | 4.79 | 1.83 | 80.46 |
3000 | 7 | 5.43 | 1.83 | 70.93 |
3000 | 8 | 6.03 | 1.82 | 63.85 |
4000 | 0 | 1.00 | 1.00 | 2706.37 |
4000 | 1 | 0.78 | 0.00 | 3488.43 |
4000 | 2 | 1.31 | 1.31 | 2066.20 |
4000 | 3 | 1.77 | 1.44 | 1526.49 |
4000 | 4 | 2.07 | 1.44 | 1308.90 |
4000 | 5 | 2.29 | 1.43 | 1182.51 |
4000 | 6 | 2.28 | 1.38 | 1185.99 |
4000 | 7 | 2.16 | 1.32 | 1252.05 |
4000 | 8 | 2.21 | 1.30 | 1224.98 |
5000 | 0 | 1.00 | 1.00 | 1909.29 |
5000 | 1 | 0.78 | 0.00 | 2457.51 |
5000 | 2 | 1.61 | 1.61 | 1187.01 |
5000 | 3 | 2.45 | 1.76 | 780.55 |
5000 | 4 | 3.24 | 1.80 | 589.95 |
5000 | 5 | 3.96 | 1.81 | 481.70 |
5000 | 6 | 4.47 | 1.78 | 427.20 |
5000 | 7 | 5.01 | 1.78 | 380.78 |
5000 | 8 | 5.47 | 1.76 | 349.23 |
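A sketch of the per-run record mentioned above: log wall clock and CPU time around a stand-in workload (here a numpy matrix multiply; in practice it would be the multiply(N, nthreads) call from the posted code) so that contention from other users' jobs is visible whenever a timing looks anomalous.

import time
import numpy

N = 2000
a = numpy.random.rand(N, N)
b = numpy.random.rand(N, N)

tWall, tCpu = time.time(), time.clock()
numpy.dot(a, b)                      # stand-in for the benchmarked call
wall = time.time() - tWall
cpu = time.clock() - tCpu

# keep (N, wall, cpu) for every run; a wall time much larger than expected
# with an unchanged cpu time points at a loaded machine
print N, wall, cpu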
Machine details
$ uname -a
Linux poole 2.6.18-6-amd64 #1 SMP Sun Feb 10 17:50:19 UTC 2008 x86_64 GNU/Linux
8 of these
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 65
model name      : Dual-Core AMD Opteron(tm) Processor 8220
stepping        : 3
cpu MHz         : 2812.978
cache size      : 1024 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow pni cx16 lahf_lm cmp_legacy svm cr8_legacy
bogomips        : 5630.17
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc
and memory
MemTotal: 32964928 kB
Curvature expressions in FiPy
Curvature | Expression | FiPy Expression |
iso | (phi.getFaceGrad() / phi.getFaceGrad().getMag()).getDivergence() | |
mean | (phi.getFaceGrad() / numerix.sqrt(1 + phi.getFaceGrad().getMag()**2)).getDivergence() | |
Gauss | ((phi.getFaceGrad() * (1,0)).getDivergence() * (phi.getFaceGrad() * (0,1)).getDivergence() - phi.getFaceGrad()[::-1].getDivergence() / 2) / (1 + phi.getFaceGrad().getMag()**2)**2 |
Is there a vector form for the Gaussian curvature? It may help the FiPy expression.
Very good differential geometry book http://www.archive.org/details/differentialgeom003681mbp.
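A quick sanity check (a sketch) for the iso expression in the table: for a signed distance function to a circle of radius R, the level-set curvature should be close to 1/R on the circle itself. The mesh, R and the coordinate unpacking are assumptions; the unpacking in particular assumes a FiPy version where getCellCenters() comes back with one row per dimension.

from fipy import Grid2D, CellVariable
from fipy.tools import numerix

mesh = Grid2D(nx=100, ny=100, dx=0.04, dy=0.04)
x, y = numerix.array(mesh.getCellCenters())

R = 1.0
# signed distance to a circle of radius R centred at (2, 2)
phi = CellVariable(mesh=mesh,
                   value=numerix.sqrt((x - 2.)**2 + (y - 2.)**2) - R)

iso = (phi.getFaceGrad() / phi.getFaceGrad().getMag()).getDivergence()
# away from the centre and the walls, iso should be roughly 1 / (distance from
# the centre), i.e. about 1 / R = 1 on the phi = 0 contour
print iso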
openmp on the altix
The following are some timing results on the Altix. The test case is a matrix multiplication. The time goes like N**3 where N is a side of the matrix. The numbers are averaged over 5 runs.
N | Threads | speedups | scale factor | raw times (s) |
1000 | 1 | 1.00 | 1.00 | 9.99 |
1000 | 2 | 1.69 | 1.69 | 5.93 |
1000 | 4 | 2.86 | 1.69 | 3.23 |
1000 | 8 | 4.66 | 1.67 | 2.09 |
1000 | 16 | 8.05 | 1.68 | 1.17 |
1000 | 32 | 8.08 | 1.52 | 1.05 |
1000 | 64 | 9.53 | 1.46 | 0.95 |
2000 | 1 | 1.00 | 1.00 | 134.38 |
2000 | 2 | 1.79 | 1.79 | 75.21 |
2000 | 4 | 3.26 | 1.81 | 38.96 |
2000 | 8 | 5.85 | 1.80 | 22.89 |
2000 | 16 | 10.95 | 1.82 | 12.28 |
2000 | 32 | 13.35 | 1.68 | 8.03 |
2000 | 64 | 18.32 | 1.62 | 6.81 |
4000 | 1 | 1.00 | 1.00 | 1139.32 |
4000 | 2 | 1.77 | 1.77 | 640.28 |
4000 | 4 | 3.24 | 1.80 | 348.68 |
4000 | 8 | 5.90 | 1.81 | 190.78 |
4000 | 16 | 11.03 | 1.82 | 102.81 |
4000 | 32 | 16.39 | 1.75 | 63.89 |
4000 | 64 | 22.61 | 1.68 | 44.12 |
500 | 1 | 1.00 | 1.00 | 0.32 |
500 | 2 | 1.49 | 1.49 | 0.21 |
500 | 4 | 2.91 | 1.71 | 0.11 |
500 | 8 | 2.44 | 1.35 | 0.11 |
500 | 16 | 2.91 | 1.31 | 0.11 |
500 | 32 | 1.46 | 1.08 | 0.21 |
500 | 64 | 1.46 | 1.07 | 0.21 |
The important consideration is the scale factor and how that holds up as the threads increase. The C code used on the Altix is below. Still having problems with fipy as distutils is missing. As a point of reference I found this quote in an article, "More typical code will have a lower limit; 1.7x-1.8x are generally considered very good speedup numbers for code run on two threads" (http://cache-www.intel.com/cd/00/00/31/64/316421_316421.pdf). There is a start-up time associated with threading. It is recommended that threads are maintained for the duration of a program run. This may be difficult to achieve with weave.
Note:
The results need to be tested against unthreaded code.
/******************************************************************************
* FILE: omp_mm.c
* DESCRIPTION:
*   OpenMp Example - Matrix Multiply - C Version
*   Demonstrates a matrix multiply using OpenMP. Threads share row iterations
*   according to a predefined chunk size.
* AUTHOR: Blaise Barney
* LAST REVISED: 06/28/05
******************************************************************************/
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

int main (int argc, char *argv[])
{
  int tid, nthreads, i, j, k, chunk;
  int N=4000;
  // int loop;
  // int loops=1;

  double *a;
  a = malloc(N * N * sizeof(double *));
  double *b;
  b = malloc(N * N * sizeof(double *));
  double *c;
  c = malloc(N * N * sizeof(double *));

  printf("finished allocating\n");

  chunk = 10;    /* set loop iteration chunk size */

  /*** Spawn a parallel region explicitly scoping all variables ***/
  #pragma omp parallel shared(a,b,c,nthreads,chunk) private(tid,i,j,k)
  {
    tid = omp_get_thread_num();
    if (tid == 0)
      {
        nthreads = omp_get_num_threads();
        printf("Starting matrix multiple example with %d threads\n",nthreads);
        //printf("Initializing matrices...\n");
      }

    /*** Initialize matrices ***/
    #pragma omp for schedule (runtime)
    // #pragma omp for schedule (static, chunk)
    for (i=0; i<N; i++)
      for (j=0; j<N; j++)
        a[i * N + j]= i+j;

    //#pragma omp for schedule (static, chunk)
    #pragma omp for schedule (runtime)
    for (i=0; i<N; i++)
      for (j=0; j<N; j++)
        b[i * N + j]= i*j;

    #pragma omp for schedule (runtime)
    //#pragma omp for schedule (static, chunk)
    for (i=0; i<N; i++)
      for (j=0; j<N; j++)
        c[i * N + j]= 0;

    /*** Do matrix multiply sharing iterations on outer loop ***/
    /*** Display who does which iterations for demonstration purposes ***/
    // printf("Thread %d starting matrix multiply...\n",tid);
    #pragma omp for schedule (runtime)
    // #pragma omp for schedule (static, chunk)
    // for(loop=0; loop<loops; loop++)
    for(i=0; i<N; i++)
      for(j=0; j<N; j++)
        for (k=0; k<N; k++)
          c[i * N + j] += a[i * N + k] * b[k * N + j];

  }  /*** End of parallel region ***/

  /*** Print results ***/
  //printf("******************************************************\n");
  //printf("Result Matrix:\n");
  //for (i=0; i<NRA; i++)
  //  {
  //  for (j=0; j<NCB; j++)
  //    printf("%6.2f ", c[i][j]);
  //  printf("\n");
  //  }
  //printf("******************************************************\n");
  //printf ("Done.\n");
}
openmp/weave timings.
A matrix multiplication in weave really scales well with openmp. The code is here. The observed speedup is almost perfect with two threads.
This code, involving large array multiplications of size N, has the following speedups with two threads.
N | Speedup |
1E7 | 1.51 |
1E6 | 1.37 |
1E5 | 1.39 |
1E4 | 1.0 |
It should be noted that the number of loops in python increased inversely with the size of the array.
The question remains whether we can get speed ups for smaller arrays typically used in FiPy.
openmp and weave
The following steps are required to get openmp working with weave; I've tried this on poole and rosie.
1) Install mpfr version 2.3.1
$ ../configure --prefix=${USR}
$ make
$ make install
2) Get gcc version 4.3 or 4.2.4 (when it is released). This version is needed otherwise you will get the "ImportError: libgomp.so.1: shared object cannot be dlopen()ed" error. gomp was not set up for dynamic loading in early gcc openmp compatible versions.
3) Create a new directory and configure with
$ ../configure --prefix=${USR} --disable-multilib
where USR is in your local directories somewhere.
4) make; make install
Streamlines
How to make a contour plot of the streamlines for a flow field in 2D?
Given a velocity field we need a field that is constant in directions that are tangent to the velocity field. This leads to the equation,
The relationships and satisfy the equation. From the above equations we can say, Solving this gives the stream-function. If we have no inflow conditions on regular domain walls, then we can choose on these walls. Since these walls have no circulation, we have made the choice that represents no circulation, which is sensible. The circulation orientation is given by the sign of .
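A sketch of how the stream function could be computed in FiPy, assuming the usual definitions u = dψ/dy and v = -dψ/dx, so that cross differentiating gives a Poisson equation for ψ with source du/dy - dv/dx. The mesh, the zero velocity fields and the accessor spellings are placeholders (in practice u and v come from the flow solution, and the names vary between FiPy versions):

from fipy import Grid2D, CellVariable, DiffusionTerm, FixedValue

mesh = Grid2D(nx=100, ny=100, dx=0.01, dy=0.01)
u = CellVariable(mesh=mesh, name="u")   # x component of the velocity
v = CellVariable(mesh=mesh, name="v")   # y component of the velocity

psi = CellVariable(mesh=mesh, name="stream function")

# with u = d(psi)/dy and v = -d(psi)/dx, div(grad(psi)) = du/dy - dv/dx
eq = DiffusionTerm(coeff=1.) == u.getGrad()[1] - v.getGrad()[0]

# psi = 0 on the no-inflow walls encodes the "no circulation" choice
bcs = (FixedValue(faces=mesh.getExteriorFaces(), value=0.),)
eq.solve(var=psi, boundaryConditions=bcs)

# contour lines of psi are then the streamlines; the sign of psi gives the
# circulation orientation

2D simulations integrated with subversion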
ID | status | condor ID | Jim's movie | Bill's movie | |||||||
1 | unstable | movie | movie | ||||||||
2 | running | 12095.0 | movie | movie | |||||||
3 | stopped | movie | movie | ||||||||
4 | running | 12096.0 | movie | movie | |||||||
5 | unstable | movie | movie | ||||||||
6 | running | 12097.0 | movie | movie | |||||||
7 | stopped | movie | movie | ||||||||
8 | running | 12100.0 | movie | movie | |||||||
9 | unstable | movie | movie | ||||||||
10 | running | poole 5545 | movie | movie | |||||||
11 | running | poole 9411 | movie | movie | |||||||
12 | stopped | movie | movie | ||||||||
13 | stopped | movie | movie |
2D simulations
Table of simulations
ID | status | condor ID | Initial | T | data directory | movie link | source files | |||||||
H2D-restart | H2D | 550 | movie | |||||||||||
I2D-restart | I2D | 550 | movie | |||||||||||
J2D-restart | running | 9099.0 | J2D | 600 | dataFri-Feb-29-12:25:35-2008 | movie | files | |||||||
K2D-restart | running | 9098.0 | K2D | 600 | dataFri-Feb-29-12:25:35-2008 | movie | files | |||||||
L2D | unstable | 9077.0 | 650 | dataWed-Feb-27-16:59:24-2008 | movie | files | ||||||||
L2D1 | unstable | 9087.0 | 650 | dataThu-Feb-28-11:27:18-2008 | movie | files | ||||||||
M2D | running | 9088.0 | 650 | dataThu-Feb-28-11:28:02-2008 | movie | files | ||||||||
N2D | swapping | 9089.0 | 650 | dataThu-Feb-28-11:31:03-2008 | movie | files | ||||||||
O2D | unstable | 9090.0 | 650 | dataFri-Feb-29-01:27:24-2008 | movie | files | ||||||||
O2D1 | running | 9108.0 | 650 | dataFri-Feb-29-14:35:04-2008 | movie | files | ||||||||
P2D | running | 9111.0 | 650 | dataSat-Mar--1-07:01:06-2008 | movie | files | ||||||||
Q2D | swapping | 9112.0 | 650 | dataSun-Mar--2-22:30:18-2008 | movie | files | ||||||||
R2D | unstable | 9083.0 | 650 | dataThu-Feb-28-07:43:56-2008 | movie | files | ||||||||
R2D1 | running | 9093.0 | 650 | dataThu-Feb-28-11:37:49-2008 | movie | files | ||||||||
S2D | running | 9094.0 | 650 | dataThu-Feb-28-11:38:37-2008 | movie | files | ||||||||
T2D | swapping | 9095.0 | 650 | dataThu-Feb-28-11:41:06-2008 | movie | files |
Upper bound on CFL condition
The simulations below do not seem to have a bounded . They do, however, have an upper bound for that decreases dramatically as is reduced. Of course, at lower the upper bound for is probably due to some other numerical factor unrelated to .
Matrix of simulations
ID | status | data directory | movie link | ||||||||||
N | finished | None | 1/4 | 1/2 | 1/4 | dataMon-Feb-25-14:55:59-2008 | movie | ||||||
N3 | unstable | None | 1/4 | 1/2 | 1/4 | dataWed-Feb-27-12:27:04-2008 | movie | ||||||
O | finished | None | 1/4 | 1/2 | 1/4 | dataMon-Feb-25-17:58:15-2008 | movie | ||||||
P | unstable | None | 1/4 | 1/2 | 1/4 | dataMon-Feb-25-18:30:26-2008 | movie | ||||||
Q | finished | None | 1/4 | 1/2 | 1/4 | dataMon-Feb-25-18:57:53-2008 | movie | ||||||
Q1 | finished | None | 1/4 | 1/2 | 1/4 | dataWed-Feb-27-14:51:24-2008 | movie | ||||||
R | unstable | None | 1/4 | 1/2 | 1/4 | dataTue-Feb-26-10:13:19-2008 | movie | ||||||
S | unstable | None | 1/4 | 1/2 | 1/4 | dataTue-Feb-26-11:45:36-2008 | movie |
2D simulations with reduced temperature
Matrix of 2D simulations with reduced temperature
ID | status | condor ID | Initial | T | data directory | movie link | |||||||
A2D | finished | 6952.0 | 650 | dataWed-Jan-16-00:48:07-2008 | movie | ||||||||
B2D | unstable | 7672.0 | A2D | 550 | dataFri-Jan-25-18:17:34-2008 | movie | |||||||
C2D | finished | 7828.0 | 650 | dataWed-Jan-30-11:53:40-2008 | movie | ||||||||
D2D | finished | 7836.0 | 650 | dataWed-Jan-30-12:19:30-2008 | movie | ||||||||
E2D | finished | 7848.0 | 650 | dataWed-Jan-30-13:52:02-2008 | movie | ||||||||
F2D | finished | 7859.0 | 650 | dataWed-Jan-30-14:09:54-2008 | movie | ||||||||
G2D | finished | 7869.0 | 650 | dataWed-Jan-30-14:15:37-2008 | movie | ||||||||
H2D | finished | 8541.0 | F2D | 550 | dataThu-Feb-14-12:09:33-2008 | movie | |||||||
I2D | finished | 8582.0 | F2D | 550 | dataFri-Feb-15-14:55:08-2008 | movie | |||||||
J2D | finished | 8803.0 | F2D | 600 | dataTue-Feb-19-10:20:01-2008 | movie | |||||||
K2D | finished | 8805.0 | F2D | 600 | dataTue-Feb-19-10:20:56-2008 | movie |
Most of these jobs ran for a good period of time, but were brought down with server problems on 02/14/08 and 02/22/08. When they restarted, the condor system began them from scratch, so I killed them and will restart any useful ones from the latest point in the data directory above. However, it is probably best to keep processors for the new set of 2D jobs as these will probably have unbounded time steps.
Efficiency
It is interesting to compare time versus time steps for some of the 2D jobs and the new 1D jobs. represents number of time steps.
Simulations E2D, F2D and G2D all lie on the same curve as expected. Elapsed time in simulations N and O is orders of magnitude greater than in the 2D simulations for an equivalent number of time steps. This is not the case in simulation O as it has a more restrictive CFL condition (see Table).
1D simulation with vapor
Previously, we've managed to get the 1D liquid-solid system working with all the desired numbers (apart from the liquid viscosity) with only a CFL time step stability criterion. We now need to include the vapor and see how the time step is affected. I'm assuming from previous experience and intuition that adding the vapor will require a far stricter stability limit.
Simulation With Vapor
As per usual my intuition proves to be false. The time step size is completely unrestricted. I'm extremely excited. We may be able to do the 2D simulations in orders of magnitude less real time. There are also some extra simulations to look at initial liquid fraction and domain size. It is clear from simulation N that diffusion based interface motion is occurring at these parameter values.
Diffusion Driven Interface Motion
Examining simulations N, O, Q, it is evident that interface motion is driven by diffusion. Notice also that the pressure equilibrates sooner as the viscosity is reduced and it seems that the concentration gradients are more pronounced. In simulation Q, the pressure is equilibrated much earlier, before significant interface motion occurs and concentration gradients form.
Stability
To maintain stability for lower , is being reduced. Simulations P, R, S were all unstable. The lowest value of that was successfully computed is , simulation Q. The following figure shows simulation time against number of required steps (). It is clear that as the run times are significantly increased.
Interface Motion
Interface motion plot for simulation N, O and Q. At least between and the solid liquid interface motion is mostly unaffected by .
Matrix of simulations
ID | status | data directory | movie link | ||||||||||
H | finished | None | 0 | 1/2 | 1/2 | dataThu-Feb-21-17:46:23-2008 | movie | ||||||
L | finished | None | 1/3 | 1/3 | 1/3 | dataMon-Feb-25-11:44:28-2008 | movie | ||||||
M | finished | None | 1/4 | 1/2 | 1/4 | dataMon-Feb-25-12:08:13-2008 | movie | ||||||
N | finished | None | 1/4 | 1/2 | 1/4 | dataMon-Feb-25-14:55:59-2008 | movie | ||||||
O | finished | None | 1/4 | 1/2 | 1/4 | dataMon-Feb-25-17:58:15-2008 | movie | ||||||
P | unstable | None | 1/4 | 1/2 | 1/4 | dataMon-Feb-25-18:30:26-2008 | movie | ||||||
Q | running | None | 1/4 | 1/2 | 1/4 | dataMon-Feb-25-18:57:53-2008 | movie | ||||||
R | unstable | None | 1/4 | 1/2 | 1/4 | dataTue-Feb-26-10:13:19-2008 | movie | ||||||
S | unstable | None | 1/4 | 1/2 | 1/4 | dataTue-Feb-26-11:45:36-2008 | movie |
Realistic Solid Behavior
The following plot includes the curves for each of the 1D simulations (see Table).
It is clear that has more influence on the interface motion than during melting.
The next step is to try simulation H with the vapor added in 1D to figure out a satisfactory time step.
Also, it would be interesting to know how influences the interface motion.
Finite Size Effects
Having examined simulations I and K and compared them with B and D (see Table), it seems that there may be finite size effects in these simulations. The interface motions are way off. For example simulation B and I use the same parameter set yet the interface motion is over two orders of magnitude slower for case I. Some simulations need to be performed to ascertain whether there is an eventual convergence of the interface motion as the domain is expanded.
Table of Simulations
ID | status | data directory | movie link | |||||||
B | finished | None | 0.1 | dataWed-Feb-20-14:09:53-2008 | movie | |||||
I | finished | None | 0.1 | dataThu-Feb-21-18:08:24-2008 | movie | |||||
J | finished | None | 0.1 | dataFri-Feb-22-15:14:20-2008 | movie |
Chat with Jim
After a chat with Jim we both came to the realization that curves on a liquid fraction against time plot for varying domain sizes do not have to correspond at all. That raises the question about how to determine whether we have finite size effects. There does not seem to be an easy way to do this. Certainly plotting interface position as a function of time may be more enlightening. The velocity of the interface at early times should be independent, but at long times I don't think that is true.
Absolute Interface Position
It is clear from the figure that all our fears are unfounded and this whole episode was a waste of time. Anyhow, it is interesting to notice that there is more "ringing" in the larger domain.
High solid viscosity
Jim wants to understand why we cannot model a rigid solid interface with a high viscosity. He needs the story for his talk. The suggestion is to run a simulation with a very high viscosity in the solid and a liquid that is not at equilibrium to induce melting of the solid.
Initial Condition
The initial condition is a phase fraction of 1/2 liquid and 1/2 solid. The solid is at the three phase equilibrium values, and . The initial densities in the liquid region are given by,
In this case . This induces melting of the solid. For the numbers that we use in the example the final liquid fraction is given by 0.55. Using this approach there is a lower bound for and thus an upper bound on the final liquid fraction.
Regular solid viscosity
When the solid behaves much like a liquid. The movie is given here for (dir: dataWed-Feb-20-10:37:23-2008). In this simulation there is no clear distinction between interface motion due to pressure equilibration and motion due to the concentration gradient. In simulations where the temperature is reduced there is a clear distinction between these regimes of interface motion. One issue is that the total interface motion is equivalent to the interface width. It is definitely a good idea to run simulations with a small interface width compared with the total interface motion.
Interestingly enough, we can run this simulation with no upper bound on time step other than CFL. The parasitic velocities eventually give an upper bound less than 1. Here is the movie (dir: dataWed-Feb-20-14:09:53-2008)
High solid viscosity
When the simulation shows similar interface motion. Here is the movie (dir: dataWed-Feb-20-14:36:18-2008). A simulation has also been run with a bounded (dir: dataWed-Feb-20-10:47:36-2008)
Liquid Fraction plot
The following plot is the liquid fraction against time for the whole matrix of simulations. The simulation with , and shows rather strange long time behavior. at long times, which could be too large.
Low diffusion in the solid
To get closer to real solid behavior, the value of can be reduced. In this movie, (dir: dataWed-Feb-20-18:22:20-2008).
Matrix of simulations
ID | status | data directory | movie link | |||||||
A | finished | 0.1 | dataWed-Feb-20-10:37:23-2008 | movie | ||||||
B | finished | None | 0.1 | dataWed-Feb-20-14:09:53-2008 | movie | |||||
C | finished | 0.1 | dataWed-Feb-20-10:47:36-2008 | movie | ||||||
D | finished | None | 0.1 | dataWed-Feb-20-14:36:18-2008 | movie | |||||
E | finished | 0.1 | dataWed-Feb-20-18:22:20-2008 | movie | ||||||
F | finished | None | 0.1 | dataThu-Feb-21-17:16:51-2008 | long short | |||||
G | finished | 0.1 | dataThu-Feb-21-17:40:48-2008 | movie | ||||||
H | finished | None | 0.1 | dataThu-Feb-21-17:46:23-2008 | movie | |||||
I | finished | None | 0.1 | dataThu-Feb-21-18:08:24-2008 | movie | |||||
K | finished | None | 0.1 | dataThu-Feb-21-18:11:13-2008 | movie |