stanchan
2017-07-03 19:52
has joined #community201707

greg
2017-07-03 22:07
Published v3.0.5 of DRP - It has doc updates and some UI updates.

2017-07-03 22:17
@ctrees I think you've crossed into the "let's get on the phone" type of questions

2017-07-03 22:41
I'm good with skype ?

2017-07-03 22:42
sorry to delay you... need to hold off for the holiday. Want to work 1x1 to find times for Wednesday?

2017-07-03 22:46
oh I'm in no big hurry... plus it's motivating me to dig into the docs (I just finished a read through)...

ctrees
2017-07-03 23:30
has joined #community201707

alan.mcalexander
2017-07-05 21:17
has joined #community201707

ctrees
2017-07-11 20:36
Does the RackN team know about gns3 http://gns3.com ? What I'm attempting now is to create a network training platform so that the dev and ops teams can re-use infrastructure AND the OpenDaylight controller

greg
2017-07-11 20:39
I'm not aware of it.

ctrees
2017-07-11 20:41
I'm almost thinking I could create components for each of the RackN docker containers and put them into a gns3 appliance template... the idea would be netops folks would be able to grok dev parts (new docker) while dev folks can better utilize all the cool DR inventory management

greg
2017-07-11 20:43
maybe - not sure.

ctrees
2017-07-11 20:46
When I was attempting to educate both dev and ops on why/how we should clean up the subnets, I started to build up the network model in gns3 so I could let them inspect the network traffic with wireshark.... the newer gns3 server is doing about the same thing (arch wise) as you were doing (making all services in containers)...

greg
2017-07-11 20:47
okay - I'd need to look at it and see.

ctrees
2017-07-11 20:48
anyway... if I get something usable I'll link it here... I'll attempt it; when I dug into its V2.0.x release it fit very well with your containers...

ctrees
2017-07-11 21:52
I think this is where I get hung up with the DR install... based on:

ctrees
2017-07-11 21:52
curl -fsSL https://raw.githubusercontent.com/digitalrebar/digitalrebar/master/deploy/quickstart.sh | bash -s -- --con-provisioner --con-dhcp --admin-ip=1.1.2.3/24

ctrees
2017-07-11 21:52
from the github quickstart...

ctrees
2017-07-11 21:53
should I first set a static IP on the 'clean' ubuntu that ONLY has the one route to the gateway, then use that static IP as the admin-ip ?
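A minimal sketch of that approach, assuming an Ubuntu 16.04 admin host; the interface name and addresses are examples, not taken from the chat:

```
# give the admin host a fixed address first (Ubuntu 16.04 /etc/network/interfaces style)
sudo tee -a /etc/network/interfaces >/dev/null <<'EOF'
auto ens38
iface ens38 inet static
    address 192.168.124.10
    netmask 255.255.255.0
EOF
sudo ifup ens38

# then hand the same address to the quickstart as the admin-ip
curl -fsSL https://raw.githubusercontent.com/digitalrebar/digitalrebar/master/deploy/quickstart.sh \
  | bash -s -- --con-provisioner --con-dhcp --admin-ip=192.168.124.10/24
```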

ctrees
2017-07-11 21:56
in reality I am putting this in a nested vm using VMware with an ubuntu that runs qemu and docker... which I'll attempt later to see if I can package your docker images into that... but for now, I am launching a VMware vm (ubuntu) to just run DR but let gns3 manage the network (it is running the pfsense router within a virtual network which can be nat'd out to the internet)

ctrees
2017-07-11 21:58
Just as RackN has a way to bring up qemu PXE clients, gns3 does the same thing but for cisco and other router OSes

2017-07-11 22:00
OH... that confused me a bit... the 'Rob' bridge echo

greg
2017-07-11 22:24
admin-ip is what DR is going to advertise as itself when nodes try and talk back to it.

greg
2017-07-11 22:24
Nodes should be able to talk to DR through that IP.

ctrees
2017-07-11 22:25
so it has to hear broadcast traffic....

greg
2017-07-11 22:25
well - that is the DHCP aspect. It doesn't have to if you have helpers that can direct to that IP. DHCP is a little different because of its L2-specific nature.

greg
2017-07-11 22:26
The DHCP server will do the "right" thing with regard to picking interfaces and IPs to respond with inside the DHCP protocol. The challenge is in the provisioner and beyond, and that is controlled by the admin-ip.
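A hedged sketch of the "helpers" case greg mentions: nodes on another L2 segment can still reach DR if something relays their DHCP broadcasts to the admin address. Addresses and interface names here are illustrative only:

```
# ISC dhcrelay on the node-facing segment: listen on ens38, forward to the DR admin IP
sudo dhcrelay -i ens38 192.168.124.10
# on a Cisco-style router the equivalent is "ip helper-address 192.168.124.10"
# configured on the node-facing interface
```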

ctrees
2017-07-11 22:26
which was where I think I got caught before, as I had multiple IPs and I couldn't figure out how it was listening or where in the scripts the route tables were added...

ctrees
2017-07-11 22:27
what I should go do is figure out why/how the forwarder service works... that'll probably explain it ??

greg
2017-07-11 22:28
ugh - the bane. We have a pull request that we haven't finished to remove FORWARDER. I'm not overly fond of it for general use.

ctrees
2017-07-11 22:28
my guess is the forwarder is how you deal with virtual routes internally... ?

greg
2017-07-11 22:28
FORWARDER mode was mostly so we could run a system on a system without having to kill dnsmasq.

greg
2017-07-11 22:29
In general, it causes more pain than good.

ctrees
2017-07-11 22:29
not going to use it, but that sort of 'inception' thing is basically the only thing I get tripped up on...

greg
2017-07-11 22:30
I tend to do HOST for just about everything. FORWARDER only really works (you can force it) for linux boxes running isolated kvms that are also attached to the docker bridge.

greg
2017-07-11 22:30
Or bridging whole nics onto the docker bridge.
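For reference, a rough sketch of the "bridge a whole NIC onto the docker bridge" case (FORWARDER mode); interface names are examples, and as greg says HOST mode is the saner path:

```
sudo ip link set ens38 master docker0   # attach the physical NIC to the docker0 bridge
sudo ip addr flush dev ens38            # the bridge, not the NIC, should own the address
```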

ctrees
2017-07-11 22:30
I'm hoping to setup 'real router' os simulations that then hook into the newer SDN stuff...

ctrees
2017-07-11 22:31
OH... so what's the 'do all' command line for HOST ? just leave off the admin-ip command ?

greg
2017-07-11 22:32
The model I've been preferring is that you shouldn't really think of DR as a collection of services running in docker, but as a single system that provides some endpoints. HOST mode supports that view better.

ctrees
2017-07-11 22:32
.... never mind... I need to just figure it out... your scripts have it all when I dig... you're just way too 'flexible' :wink:

ctrees
2017-07-11 22:33
OK... so if I do a VM (clean install of ubuntu), which do-it-all script should I run, and how should I bring up test PXE clients (in qemu or other VMs)?
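A rough sketch of that "do-it-all" path, assuming the quickstart posted earlier in the channel (which, per the next messages, runs in HOST mode by default) plus a throwaway qemu client forced to network-boot; the bridge name and MAC are illustrative:

```
# admin side: the quickstart already posted above
curl -fsSL https://raw.githubusercontent.com/digitalrebar/digitalrebar/master/deploy/quickstart.sh \
  | bash -s -- --con-provisioner --con-dhcp --admin-ip=192.168.124.10/24

# test client: no disk needed for discovery, just force a PXE boot on the same bridge
qemu-system-x86_64 -m 2048 -boot n \
  -netdev bridge,id=net0,br=br0 \
  -device virtio-net-pci,netdev=net0,mac=52:54:00:12:34:56
```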

ctrees
2017-07-11 22:35
Oh... the github quick-and-dangerous quickstart was HOST mode... so I'm good

greg
2017-07-11 22:35
yes

ctrees
2017-07-12 15:59
In a 'test-lab' situation, is it better to have 2 network cards and let the host do the route/NAT/firewall also ? aka the admin-ip becomes the gateway ?

ctrees
2017-07-12 16:02
basically I'm thinking if I tell them this replaces the 'soho router' for testing... they'll grok it faster...

greg
2017-07-12 16:42
That is how I kvm test - it also lets me then do isolated testing with DR as webproxy and without.
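A minimal sketch of that two-NIC "soho router" setup on the admin host; ens33 faces the internet, ens38 faces the test subnet DR serves (names and addresses are examples):

```
sudo sysctl -w net.ipv4.ip_forward=1
sudo iptables -t nat -A POSTROUTING -o ens33 -j MASQUERADE
sudo iptables -A FORWARD -i ens38 -o ens33 -j ACCEPT
sudo iptables -A FORWARD -i ens33 -o ens38 -m state --state ESTABLISHED,RELATED -j ACCEPT
```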

ctrees
2017-07-12 16:46
so you do that 'with a vm' and 'on real h/w'... it's the nesting of VMs and the network stack, as it seems like lots of tools are now attempting to 'help' make adjustments... netstat -rn is showing me lots of adjustments (more from VMware, VirtualBox and GNS3)... so it's not really anything to do with DR, but through the DR scripts I'm finding out how you guys deal with those situations

ctrees
2017-07-12 16:52
Say... while I'm on my little network inception mind-bend... would you just PXE boot to recycle, or use Mesos ? or go Kubernetes... I know it 'depends' but I was wondering, when I watched your packet demos, how you would add additional packet servers to the test cluster (as you were just adding qemu VMs in that video demo)

ctrees
2017-07-12 16:56
seems like you guys have basically tested them all... seems like going back down to metal with PXE is the cleanest for recycle... I eventually want to attempt to recycle to move equipment to a new resource pool which I'm pretty sure you guys have done

greg
2017-07-12 17:08
We usually recommend a complete rebuild. Mesos should work because it handles dynamic workers. K8S is ideal; workers are supposed to be replaceable.

2017-07-12 17:10
so that's why the drive to K8S demos then

greg
2017-07-12 17:16
well - it also appears to be becoming more popular than Mesos. Lots of movement there.

ctrees
2017-07-12 19:38
So is Goiardi eventually going away ? Which container drives the annealing process ? I take it server state is kept in the postgres db and changes are change events that trigger the annealing... just not sure what service does it

greg
2017-07-12 19:40
rebar-api drives the annealing process.

greg
2017-07-12 19:40
goiardi is a go-based chef server. It is used by some roles and currently won't go away for a while.

greg
2017-07-12 19:41
rebar-api is a rails app that handles the API layer (mostly) and a set of worker threads do annealing.

ctrees
2017-07-12 19:48
I noticed the consul, so I was wondering... you must have written rebar-api pre-terraform ? (more evolution curiosity is all...)

greg
2017-07-12 19:51
yes, but rebar-api does a lot more than terraform ever will, and less at the same time.

ctrees
2017-07-12 19:52
more the hope to avoid ruby... at least hashi started to put newer stuff in Go

greg
2017-07-12 19:52
Well - the core of that rebar-api was started 8 years ago.

greg
2017-07-12 19:53
We are moving it to go over time.

ctrees
2017-07-12 19:53
which I know you guys are doing also (Go)... yea, and you were ops guys, so it was hard to avoid ruby 8 years ago...

greg
2017-07-12 19:54
we needed an API endpoint that was the UI as well. That pretty much meant rails or a really bad state of django or some java thing.

greg
2017-07-12 19:55
I find terraform confounding. It is good but unbounded. Much like ansible.

greg
2017-07-12 19:57
You can do anything and everything and so people do and there is very low repeatability, testability, and abstraction. It makes it really hard to isolate problems or operations.

ctrees
2017-07-13 16:35
Clean install on Ubuntu VMWare, 2 network (host only and a bridge to internet)

ctrees
2017-07-13 16:35
TASK [Update repos (was not working from apt:)] ******************************************************************************************************************
[WARNING]: Consider using apt module rather than running apt-get
fatal: [172.16.240.2]: FAILED! =>

ctrees
2017-07-13 16:36
Digital Rebar UI https://172.16.240.2
cat@ubuntu:~/digitalrebar/deploy$ sudo ./run-in-system.sh --deploy-admin=local --access=HOST --admin-ip=172.16.240.2

greg
2017-07-13 16:37
Admin IP needs a CIDR

ctrees
2017-07-13 16:39
sorry... what's a CIDR ? an env thing ?

ctrees
2017-07-13 16:41
or is that set up in the startup... I could just reset... and use quickstart... I didn't realize quickstart needs 6GB (had 4GB set)...

ctrees
2017-07-13 16:42
but I saw the attempt at root in the past... thinking it's an ssh key / ansible thing

ctrees
2017-07-13 16:44
OK... --admin-ip=172.16.240.2/24

ctrees
2017-07-13 16:49
failure on same step...

ctrees
2017-07-13 16:49
cat@ubuntu:~/digitalrebar/deploy$ netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         192.168.9.1     0.0.0.0         UG        0 0          0 ens33
172.16.240.0    0.0.0.0         255.255.255.0   U         0 0          0 ens38
192.168.9.0     0.0.0.0         255.255.255.0   U         0 0          0 ens33

ctrees
2017-07-13 16:50
cat@ubuntu:~/digitalrebar/deploy$ ifconfig
ens33  Link encap:Ethernet  HWaddr 00:0c:29:42:6b:8e
       inet addr:192.168.9.62  Bcast:192.168.9.255  Mask:255.255.255.0
       inet6 addr: fe80::20c:29ff:fe42:6b8e/64 Scope:Link
       UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
       RX packets:6618 errors:0 dropped:0 overruns:0 frame:0
       TX packets:361 errors:0 dropped:0 overruns:0 carrier:0
       collisions:0 txqueuelen:1000
       RX bytes:2917977 (2.9 MB)  TX bytes:34729 (34.7 KB)
ens38  Link encap:Ethernet  HWaddr 00:50:56:39:54:5f
       inet addr:172.16.240.2  Bcast:172.16.240.255  Mask:255.255.255.0
       inet6 addr: fe80::250:56ff:fe39:545f/64 Scope:Link
       UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
       RX packets:1563 errors:0 dropped:0 overruns:0 frame:0
       TX packets:1166 errors:0 dropped:0 overruns:0 carrier:0
       collisions:0 txqueuelen:1000
       RX bytes:122905 (122.9 KB)  TX bytes:441659 (441.6 KB)
lo     Link encap:Local Loopback
       inet addr:127.0.0.1  Mask:255.0.0.0
       inet6 addr: ::1/128 Scope:Host
       UP LOOPBACK RUNNING  MTU:65536  Metric:1
       RX packets:301 errors:0 dropped:0 overruns:0 frame:0
       TX packets:301 errors:0 dropped:0 overruns:0 carrier:0
       collisions:0 txqueuelen:1
       RX bytes:36506 (36.5 KB)  TX bytes:36506 (36.5 KB)

greg
2017-07-13 19:33
yes

ctrees
2017-07-14 17:30
So I got the quick install running on Ubuntu (under VMware, as it supports nested virtualization). I added another VMware VM, set it to PXE boot, and it did... (the node showed up in the DHCP with its MAC and IP), but when it rebooted, the VM image seemed to have lost the network card used during the PXE boot. The host system still sees the IP and the route but I don't think sledgehammer can report info to DR.

ctrees
2017-07-14 17:34
I suspect it's a VMWare thing just wondering if you've ever seen this issue.

ctrees
2017-07-14 18:22
Tried the 'devices.hotplug = "FALSE" ' .vmx mod solution

ctrees
2017-07-14 18:23
no happiness...

greg
2017-07-14 18:32
You need to make sure that the mac is set and not changing. Also, the client vm should have only one NIC in one network.

ctrees
2017-07-14 18:35
Yup... I set the MAC static AND only one NIC... again I'm pretty sure it's VMware dropping the darn thing, just wondering if you've run into it.
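For anyone hitting the same thing, a hedged sketch of pinning the client VM's MAC in its .vmx alongside the hotplug tweak mentioned above; exact keys can vary by VMware version, and the MAC shown is just an example from VMware's static range:

```
cat >> pxe-client.vmx <<'EOF'
ethernet0.addressType = "static"
ethernet0.address = "00:50:56:01:02:03"
devices.hotplug = "FALSE"
EOF
```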

ctrees
2017-07-14 18:39
The only reason I'm doing the VM stuff is so the devs and ops guys can review each other's mods on an isolated system... I'm replicating the same setup in the C7000 blade center right now... I don't expect to have an issue with real metal as I've seen it work already. The C7000 blade setup had issues with its control plane wanting / defaulting to DHCP and needing that to go access the VLAN stuff... all this inception stuff for isolation

ctrees
2017-07-14 22:07
So the best way to bring a whole grid down is digitalrebar/deploy/backup.sh then bring back up with digitalrebar/deploy/restore.sh
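A hedged sketch of that cycle; whether the scripts take a target directory argument is an assumption here, so check the scripts themselves before relying on it:

```
cd ~/digitalrebar/deploy
./backup.sh ~/dr-backup      # snapshot the running grid (argument is a guess)
# ... tear down / rebuild ...
./restore.sh ~/dr-backup     # bring it back from the snapshot
```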

ctrees
2017-07-14 22:08
backup.sh has a comment that I'm not following...

ctrees
2017-07-14 22:08
# This script handles getting the first 5 items, getting the third is
# left as an exercise for the reader.

ctrees
2017-07-14 22:09
it names 3 data sets, then 5 items, then mentions 'the third' is left for the reader ?

zehicle
2017-07-14 23:18
"The file data for Goiardi" - we don't recover Chef information

zehicle
2017-07-14 23:19
but... it looks like it does in the script. It's possible that comment is out of date

ctrees
2017-07-17 15:56
I viewed the new Packet IPXE Test w DRP Endpoint RackN & Digital Rebar video. What is the best source for setting up the DHCPd options and/or DNSMasq options to emulate how the PXE hand-off works (as seen in video timecode: 4:22)?

ctrees
2017-07-17 16:11
Following that hand-off sequence seems to be key to putting DR into an existing PXE setup. I found: http://provision.readthedocs.io/en/stable/doc/arch/data.html#rs-dhcp-models - 6.3.2.1 Subnet, but I also remember seeing a more detailed DR doc about the PXE process, and cannot find it again.
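A hedged sketch of the DHCP side of that hand-off using dnsmasq (DRP's built-in DHCP server can do this itself); the addresses and provisioner URL are illustrative, and 8091 as DRP's static file port is an assumption:

```
sudo tee /etc/dnsmasq.d/ipxe-handoff.conf >/dev/null <<'EOF'
dhcp-match=set:ipxe,175                                       # clients already running iPXE send option 175
dhcp-boot=tag:!ipxe,undionly.kpxe,,192.168.124.10             # legacy PXE ROMs: hand them the iPXE binary first
dhcp-boot=tag:ipxe,http://192.168.124.10:8091/default.ipxe    # iPXE clients: chain straight to the provisioner script
EOF
```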

ctrees
2017-07-17 16:57
I take it the 'custom iPXE' option basically maps a 'Request Next Boot' to the provisioned MAC. I think I'm just struggling with what the first PXE boot is on the main DNS... I am just assuming on the Packet side they are setting up a VLAN but maybe not... it all could be my weakness in following PXE... so your new video showed how the packet server was bouncing through a few NEXT BOOTs (I think)

ctrees
2017-07-17 19:05
OH... I think I get it now:
1 - Server gets IP 147.75.90.81 (very small subnet) via DHCP at 147.75.200.3 and asks for PXE (video TC: 4:19)
2 - DHCP server sends http://147.75.200.3/auto.ipxe (http://packet.net baremetal boot image I'm guessing)
3 - (guessing as video refresh hits) auto.ipxe has 'next server' set to what was entered in the iPXE config: http://packet.rebar.digital/default.ipxe
4 - Server (PXE proto) looks for which does not exist... so it uses http://packet.rebar.digital/default.ipxe to find the next server
5 - At this point the server does 'goto sledgehammer' which points to http://147.75.73.23:80/sledgehammer/b3c09ebd5a9c228c66d8a617b6f5d10ccbe1c273/vmlinuz0
6 - THEN this image and its control string adjusts the console... and loads the stage1.img of sledgehammer
7 - I assume now that's the discovery image but it's going to REPORT findings back to 147.75.73.23 ?
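A hedged reconstruction of the script behind steps 4-6, using the URLs quoted above; the real sledgehammer kernel arguments (console settings, provisioner address, etc.) are omitted:

```
cat > default.ipxe <<'EOF'
#!ipxe
:sledgehammer
kernel http://147.75.73.23:80/sledgehammer/b3c09ebd5a9c228c66d8a617b6f5d10ccbe1c273/vmlinuz0
initrd http://147.75.73.23:80/sledgehammer/b3c09ebd5a9c228c66d8a617b6f5d10ccbe1c273/stage1.img
boot
EOF
```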

greg
2017-07-17 19:09
yes

greg
2017-07-17 19:09
That is the flow

ctrees
2017-07-17 19:12
Thanks... so sometimes when I load up DR, I don't see the docker ports mapped to Host ports... which part of the Ansible does that docker mapping, or can those run out-of-order ?

greg
2017-07-17 19:12
it is part of compose and docker

greg
2017-07-17 19:12
Ansible runs it

greg
2017-07-17 19:18
yep

wdennis
2017-07-18 16:45
@greg About ready to try a DRP upgrade from v3.0.3 to v3.0.5 - just follow http://provision.readthedocs.io/en/stable/doc/upgrade.html right?

greg
2017-07-18 16:51
Yes - I believe that most of the changes are all internal for those releases and docs.

wdennis
2017-07-18 16:54
Cool

wdennis
2017-07-18 16:57
Also, why do I have to do: `../drpcli bootenvs install bootenvs/[...]` instead of `./drpcli bootenvs install assets/bootenvs/{...}` ???

greg
2017-07-18 16:58
lame answer (kinda): because the install command assumes that there are isos and templates directories that are peers to bootenvs at the cwd of execution.

wdennis
2017-07-18 16:58
ah, thought that might be it

greg
2017-07-18 16:59
It needs to import isos (store isos) and import templates (if not found).

wdennis
2017-07-18 17:03
I just always forget that I have to cd into 'assets' before I do that
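The working-directory dance greg describes, as a short sketch (the bootenv file name is an example):

```
cd assets                                            # isos/ and templates/ now sit next to bootenvs/
../drpcli bootenvs install bootenvs/ubuntu-16.04.yml
```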

greg
2017-07-18 17:05
we are working on some big changes to how content will work and be viewed and distributed. The goal is to make it easier to track, update, and display. We'll see if we succeed at that, but ...

wdennis
2017-07-18 17:05
On DRP specifically you mean? Or full DR

greg
2017-07-18 17:08
DRP

ctrees
2017-07-19 20:12
So... can I run the DRP with DR ? (I assume DRP is the new tip of what was the Provisioner container) ? digitalrebar/dr_provisioner

greg
2017-07-19 20:13
not really. There are some tricky join actions and things are not that simple. Yes, DRP replaces the provisioner and dhcp containers, but it doesn't integrate cleanly.

ctrees
2017-07-19 20:15
So workloads-or-not is the divide between them (DR vs DRP) ?

ctrees
2017-07-19 20:16
Or is DRP's sole purpose a crowbar update ?

ctrees
2017-07-19 20:18
whoops... crowbar should be Cobbler

greg
2017-07-19 20:19
workloads are currently DR only. DRP may have something similar one day, or we may plug it into DR. The problem is that most people find DR too complex for their needs, and DRP a little more their speed - Cobbler with a few more features.

greg
2017-07-19 20:20
The plan is to get DRP into DR (or replace DR with a smaller thing that uses DRP) at some point. We are working to get DRP fully functional to what we want first.

ctrees
2017-07-19 20:29
Workloads REQUIRE DR - makes sense... thanks!

2017-07-19 23:45
Got a quick question: set up digital rebar provision, and when it tries to upload any iso into digital rebar provision I get a context deadline exceeded - any ideas? I just followed the quickstart, no other changes

greg
2017-07-19 23:51
Does DRP have access to the internet?

2017-07-19 23:52
ya, I'm able to download the iso

greg
2017-07-19 23:52
Are you running as root?

2017-07-19 23:52
no let me try doing it that way one sec

greg
2017-07-19 23:52
well wait.

greg
2017-07-19 23:53
did you do ```sudo ./dr-provision ....```

greg
2017-07-19 23:53
?

2017-07-19 23:53
yes

greg
2017-07-19 23:53
Do you have passwordless sudo and did you put it in the bg with &

greg
2017-07-19 23:53
because sometimes - drp hangs waiting for a password to run it.

2017-07-19 23:53
no passwordless sudo, i had to put in the password

greg
2017-07-19 23:53
okay

greg
2017-07-19 23:54
does ```drpcli bootenvs list``` return almost immediately?

2017-07-19 23:54
let me try

2017-07-19 23:56
ya instant

greg
2017-07-19 23:57
ok

greg
2017-07-19 23:57
thinking

greg
2017-07-19 23:57
What command are you running that fails?

2017-07-19 23:58
../drpcli bootenvs install bootenvs/blahblah.yml

2017-07-19 23:58
well not the blah blah but u get the picture

2017-07-19 23:58
any of the yml files

2017-07-19 23:59
it downloads the isos, and they do show in the isos folder but when it tries the upload step it fails with that error

greg
2017-07-19 23:59
What version are you using? stable/default or tip?

2017-07-20 00:00
the one from the quickstart

greg
2017-07-20 00:00
okay - stable if you didn't add ```--drp-version=tip```

2017-07-20 00:00
ya stable

greg
2017-07-20 00:00
Sooo - I'm not sure. What are you running on? CPU, memory, and disk?

greg
2017-07-20 00:01
Do you have enough space is the real question.

2017-07-20 00:01
hyper-v ubuntu vm 100gb of hd space for the vm

greg
2017-07-20 00:01
ok - should be okay.

2017-07-20 00:01
16.04 version of ubuntu

greg
2017-07-20 00:01
Next, a cheat to get around this problem. I'll have to try it to be sure, but it was working yesterday.

greg
2017-07-20 00:02
Is this a custom bootenv you built or one of the defaults?

2017-07-20 00:02
default

greg
2017-07-20 00:02
we've only really been testing centos7 or ubuntu16.04 bootenvs.

greg
2017-07-20 00:02
Not sure if the others will work.

2017-07-20 00:03
I tested 14.04 and that one worked, but when it booted it couldn't reach the repo to download the rest of the install

2017-07-20 00:03
the other ones gave me the error even centos

greg
2017-07-20 00:03
Yes, your vms/nodes have to have internet access to use the ubuntu images.

greg
2017-07-20 00:03
hmm - okay.

greg
2017-07-20 00:04
in the provision directory where you installed. There should be a drp-data directory

greg
2017-07-20 00:04
inside that, you have tftpboot/isos

greg
2017-07-20 00:04
that is where the isos go.

greg
2017-07-20 00:04
can you check that directory? and look at the contents.

2017-07-20 00:05
ubuntu is there, well the 14.04 one

2017-07-20 00:05
and sledgehammer

2017-07-20 00:05
sledgehammer works by the way

greg
2017-07-20 00:06
cool

greg
2017-07-20 00:06
hmmm - drpcli bootenvs show <bootenv in question>

greg
2017-07-20 00:06
does the errors field have anything?

2017-07-20 00:07
let me see

greg
2017-07-20 00:07
It seems like space or quotas or something.

greg
2017-07-20 00:07
since you've uploaded some, but not this one.

greg
2017-07-20 00:08
you can copy the iso from the assets/isos dir into the tftpboot/isos directory and then update the bootenv.

greg
2017-07-20 00:08
drpcli bootenvs update - < bootenvs/<bootenv filename>

greg
2017-07-20 00:08
I think that will "explode" the iso for that bootenv.

2017-07-20 00:09
so if it was lets say ubuntu 16.04 it would be drpcli bootenvs update -<bootenvs/ubuntu-16.04 ?

2017-07-20 00:09
ls

2017-07-20 00:10
oops

greg
2017-07-20 00:10
```drpcli bootenvs update ubuntu-16.04-install -<bootenvs/ubuntu-16.04```
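Putting the workaround together as a sketch; the ISO file name is an example and the drp-data path depends on where dr-provision was started:

```
cp assets/isos/ubuntu-16.04.3-server-amd64.iso drp-data/tftpboot/isos/    # stage the ISO by hand
drpcli bootenvs update ubuntu-16.04-install - < bootenvs/ubuntu-16.04.yml # re-apply the bootenv so it explodes the ISO
```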

2017-07-20 00:10
k i will give that a try

2017-07-20 00:11
thanks

greg
2017-07-20 00:11
otherwise, I'm running out of options. Stopping and starting drpcli will also explode isos, I think.

greg
2017-07-20 00:11
I need to step away for a while. back in an hour or so

2017-07-20 00:11
ill try running as root and see what happens as well

2017-07-20 00:11
no worries thanks

zehicle
2017-07-20 02:46
I have the right ISO and can use the CLI to upload. How do I know what to name it?

greg
2017-07-20 03:23
The file knows

greg
2017-07-20 03:24
VMware-VMvisor-Installer-201701001-4887370.x86_64.iso
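In other words, the ISO has to land under the exact name the bootenv expects. A hedged sketch, assuming the bootenv id here ("esxi-install") and that your drpcli supports the `isos upload ... as ...` form:

```
drpcli bootenvs show esxi-install | grep -i isofile    # what file name does the bootenv want?
drpcli isos upload ./VMware-VMvisor-Installer-201701001-4887370.x86_64.iso \
  as VMware-VMvisor-Installer-201701001-4887370.x86_64.iso
```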

zehicle
2017-07-20 03:31
thanks, I thought it was the name

zehicle
2017-07-20 03:33
that fixed it

2017-07-20 03:37
I was able to get around the error by uploading the iso to the tftpboot/isos dir and then running the bootenvs command

2017-07-20 03:38
was able to run the ubuntu install fully as well; just had to change the DNS to my router - for some reason it would not work using the digital rebar provision VM as the gateway or DNS

zehicle
2017-07-20 03:38
could have to do w/ how you configured the subnets

2017-07-20 03:39
probably ya

greg
2017-07-20 03:56
Yeah - we don't webproxy with DRP.

zehicle
2017-07-20 03:58
RE on ESXi installs.... you need to ensure min RAM / CPU requirements

zehicle
2017-07-20 03:58
but mine is still hanging on lsu_lsi install part

zehicle
2017-07-20 03:59
@jj did you get past that?

jj
2017-07-20 13:33
yeah, it did when I was testing it

2017-07-22 00:35
https://t.co/fNGrXZGneF third video is not the right link ;)

2017-07-22 00:35
on the packet blog post

zehicle
2017-07-22 02:33
oops! I'll let them know

zehicle
2017-07-27 19:24
officially working on t-shirt designs.... check out http://99d.me/c/fwje

2017-07-27 21:31
hi

2017-07-27 21:32
Can't get past the following in the quickstart:
TASK [Link dirs] ****************************************************************************************************************************************************************
fatal: [192.168.2.183]: FAILED! => {"changed": false, "failed": true, "gid": 0, "group": "root", "mode": "0755", "msg": "refusing to convert between directory and link for /home/administrator/digitalrebar/deploy/compose/digitalrebar", "owner": "root", "path": "/home/administrator/digitalrebar/deploy/compose/digitalrebar", "size": 4096, "state": "directory", "uid": 0}
	to retry, use: --limit @/home/administrator/digitalrebar/deploy/digitalrebar.retry

greg
2017-07-28 00:45
not sure how you are running it, but you can remove /home/administrator/digitalrebar/deploy/compose/digitalrebar - it should be a link. If it isn't, remove the directory; if it is a link, remove it. Then rerun.
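As a concrete sketch of that fix (path copied from the error above):

```
ls -ld /home/administrator/digitalrebar/deploy/compose/digitalrebar   # link or real directory?
sudo rm -rf /home/administrator/digitalrebar/deploy/compose/digitalrebar
# ...then rerun the quickstart so ansible can recreate it as a link
```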

2017-07-28 11:28
Hmm, I'll debug this further

2017-07-28 19:04
I'm trying to bring up a vagrant node and join an existing vagrant base box to test with and am having difficulty running the join_rebar.sh.

2017-07-28 19:05
when running the curl commands from the script I'm unable to connect to admin node port 3000

2017-07-28 19:06
I'm able to access the gui for digital rebar without any issues... just need to join an existing node

greg
2017-07-28 19:10
yeah - that script is old. It wasn't updated when we unified all API accesses through the auth proxy. I would say, you just have to modify the script to not use port 3000, but I don't think that is all that needs to be run. Which join_rebar.sh script are you using?

2017-07-28 19:11
Yea, I changed it to point to https://admin.node.ip and all works except for anything with a PUT

2017-07-28 19:11
so I can't register the node

zehicle
2017-07-28 19:11
unless you have a long term plan for Vagrant, I'd suggest just using VMs to test. Vagrant does not handle the PXE boot

2017-07-28 19:12
gotcha, I was just going to run some local deployments, non-baremetal if possible.

zehicle
2017-07-28 19:13
I understand, I was hoping the Vagrant path would allow faster testing cycles when we did that work by avoiding PXE cycles. Turned out that it was faster to just work against cloud VMs for that use case

zehicle
2017-07-28 19:13
(we added the provider at that point)

zehicle
2017-07-28 19:14
what are you trying to validate? it's possible that Provision may be sufficient (and simpler)

2017-07-28 19:15
yes, I just didn't have a public cloud available to test with; was trying to validate on a laptop, as I've been using kismatic to deploy k8s locally in hopes of testing a pipeline before moving to a cloud

2017-07-28 19:16
just trying to see if this is something I can bring to our company; I'm in the beginning stages of building a new datacenter

2017-07-28 19:16
and so far its exactly what I'm looking for

zehicle
2017-07-28 19:17
:slightly_smiling_face:

mniemann
2017-07-31 19:14
has joined #community201707

2017-07-31 22:44
@zehicle you here?

zehicle
2017-07-31 22:44
Yes

2017-07-31 22:45
The quickstart installation doesn't seem to work after a reboot of the host holding all the docker-compose containers

2017-07-31 22:45
I get this in the rev_proxy container: 2017/07/31 22:40:39 Request failed: Get https://172.17.0.12:3000/api/v2/users/rebar/digest: x509: certificate signed by unknown authority (possibly because of "x509: ECDSA verification failure" while trying to verify candidate authority certificate "internal")

2017-07-31 22:45
-- when trying to login

2017-07-31 22:46
when did you get the containers?

2017-07-31 22:47
as in when did I run the checkout?

2017-07-31 22:47
-- or run the quickstart itself?

2017-07-31 22:47
the first install - I think there was an issue w/ consul in the containers from about a week ago

2017-07-31 22:47
no this is pretty fresh

2017-07-31 22:48
ok

2017-07-31 22:48
are you trying to re-run the quickstart or just that the containers don't restart automatically after a reboot?

2017-07-31 22:49
the containers restarted after a reboot

2017-07-31 22:49
didn't rerun quickstart. I was really relieved that it finally worked, only to find that the default didn't work.

2017-07-31 22:50
default= default credentials

2017-07-31 22:52
I really want this to work, because I want to see DR work, but I've already spent so much time on this. I've also already filed an issue on Github (bit of a rant, maybe I should edit it).

2017-07-31 22:53
Is there a good reason that DR is so incredibly resource hungry?

2017-07-31 22:58
a couple of things...

2017-07-31 23:00
1) sorry about the issues, DR has a lot of moving parts. the container packaging helps a lot but it's still complex. that's why we suggest working w/ us to build much more than a quick start

2017-07-31 23:01
2) DR has a lot of parts including a full Ruby stack, a postgresql database, chef server and other services. so there's a lot going on there. that's why we're actively rewriting it in golang

2017-07-31 23:02
3) We're suggesting people start with DR Provision as a first stage at this point. It's much (much) lighter weight and simpler to use. No docker is required at all and the functions are easier to understand

2017-07-31 23:03
I saw DR provision, but it was the more advanced stuff I was interested in, e.g., deploying Kubernetes and Ceph, and initially provisioning through PXE in a private cluster

2017-07-31 23:03
4) Inside of DR Provision, we've been building a workflow system that's simpler to use for the use cases that we hear about the most. We're also building new UX that makes it easier to manage multiple sites

2017-07-31 23:04
for those use cases, we're putting in Ansible & Terraform integrations. Our Kubernetes work was just Kubespray anyway. Over the weekend, we added a dynamic inventory generator for Ansible with the plan to document using that as the Kubespray target
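A sketch of how that might be wired up; the inventory script name here is hypothetical since the chat doesn't give it:

```
ansible all -i ./drp-inventory.py -m ping              # sanity-check the generated inventory
ansible-playbook -i ./drp-inventory.py cluster.yml     # then point Kubespray's playbook at it
```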

2017-07-31 23:07
that approach does not leverage DR's hybrid provider concept. While we think that function is very important long term, it has been getting in the way of people exactly like you who just want to get started quickly. For that reason, we're more excited about Provision as a fast and easy win for physical data centers.

2017-07-31 23:08
The new Provision Jobs/Tasks work (just pulled today by @galthaus ) creates a huge amount of capability to do workflows like burn-in, raid/bios, discovery & auto-classification. It's going to take a while to explain how it all works

2017-07-31 23:09
interesting stuff, eager to see how it will work out. I really think you guys are filling a gap at least somewhere in the whole hybrid-or-not cloud provisioning.

2017-07-31 23:09
TL;DR - we're strongly recommending Provision as a starting point

2017-07-31 23:09
That may meet all your needs AND you can hook it into the other DR work later too

2017-07-31 23:09
Will do, but will I succeed at this point with DR Provision with the aim of provisioning/deploying ceph and kubernetes?

2017-07-31 23:10
Of course I can fill in small gaps, but is the workflow stuff ready for it?

2017-07-31 23:16
We have not integrated K8s to DR P yet. For Ceph, I'd recommend Rook (https://blog.rook.io) on top of the cluster

2017-07-31 23:17
the integration will be pretty simple: "put these profiles on the nodes you want to take on these roles, add params to the profiles to override the defaults"

2017-07-31 23:17
ansible run w/ dynamic inventory.

2017-07-31 23:18
sadly, not as easy as the DR wizard approach; however, it's more in line with the way we see people trying to use Kubespray right now.

2017-07-31 23:19
we have some interesting plans to use DR P for joining nodes to the cluster using the 1.7 node admission workflow.

2017-07-31 23:20
I think I like your above approach and will try it this week in a very small cluster

2017-07-31 23:21
btw, Rook seems rather inception-y :D, since it's deployed as containers, so that other containers can have persistent storage

2017-07-31 23:21
yes. So far, I've heard good things about it. Our other recent Ceph work was also containers using Helm via OpenStack.

2017-07-31 23:21
Thanks for your info, it was pretty enlightening

2017-07-31 23:22
I have to go though. Thanks again!

2017-07-31 23:23
you're welcome. We've been working hard to bring up all the DR P key functionality to make it easier to start there. We got the basics in place and are adding the gee whiz stuff now.